We consider the problem of coloring the vertices of a large sparse random graph with a given
number of colors so that no adjacent vertices have the same color. Using the cavity method, we 
present a detailed and systematic analytical study of the space of proper colorings (solutions). 

We show that for a fixed number of colors and as the average vertex degree (number of constraints) 
increases, the set of solutions undergoes several phase transitions similar to those observed in the 
mean field theory of glasses. First, at the clustering transition, the entropically dominant part of the 
phase space decomposes into an exponential number of pure states so that beyond this transition 
a uniform sampling of solutions becomes hard. Afterward, the space of solutions condenses over a 
finite number of the largest states and consequently the total entropy of solutions becomes smaller 
than the annealed one. Another transition takes place when in all the entropically dominant states 
a finite fraction of nodes freezes so that each of these nodes is allowed a single color in all the 
solutions inside the state. Eventually, above the coloring threshold, no more solutions are available. 
We compute all the critical connectivities for Erdős-Rényi and regular random graphs and determine
their asymptotic values for a large number of colors.

Finally, we discuss the algorithmic consequences of our findings. We argue that the onset of 
computational hardness is not associated with the clustering transition and we suggest instead that 
the freezing transition might be the relevant phenomenon. We also discuss the performance of a 
simple local Walk-COL algorithm and of the belief propagation algorithm in the light of our results. 

PACS numbers: 89.20.Ff, 75.10.Nr, 05.70.Fh, 02.70.-c 

I. INTRODUCTION 

Graph coloring is a famous yet basic problem in combinatorics. Given a graph and $q$ colors, the problem consists
in coloring the vertices in such a way that no two adjacent vertices have the same color [1]. The celebrated four-color
theorem ensures that this is always possible for planar graphs using only four colors [2]. For general graphs,
however, the problem can be extremely hard to solve and is known to be NP-complete [3], so that it is widely
believed that no algorithm can decide in polynomial time (with respect to the size of the graph) whether a given
arbitrary instance is colorable or not. Indeed, the problem is often taken as a benchmark for evaluating the performance
of algorithms in computer science. It also has important practical applications such as timetabling, scheduling, register
allocation in compilers, and frequency assignment in mobile radios.

In this paper, we study colorings of sparse random graphs [HQ. Random graphs have been one of the most fundamental
sources of challenging problems in graph theory since the seminal work of Erdős and Rényi Q in 1959. Concerning
the coloring problem, a crucial observation was made by focusing on typical instances drawn from the ensemble
of random graphs with a given average vertex connectivity $c$: as $c$ increases, a threshold phenomenon is observed.
Below a critical value $c_s$, a proper coloring of the graph with $q$ colors exists with a probability going to one in
the large size limit, while beyond $c_s$ it does not exist in the same sense. This sharp transition also appears in
other Constraint Satisfaction Problems (CSPs) such as the satisfiability of Boolean formulae ill. The existence
of the sharp COLorable/UNCOLorable (COL/UNCOL) transition was partially [85] proven in [7], and rigorously
computing its precise location is a major open problem in graph theory. Many upper and lower bounds were established
010,111 El, EH [3 EB] for Erdős-Rényi and regular random graphs.

It was also observed empirically [16, 17] that deciding colorability becomes on average much harder near the
coloring threshold $c_s$ than far away from it. This onset of computational hardness cannot be explained only by the
simple fact that near the colorability threshold the number of proper colorings is small [18]. Some progress in the
theoretical understanding has been made through the analysis of search algorithms [19, 20]. For the coloring problem, it
was proven [21] that a simple algorithm $q$-colors almost surely, in linear time, random graphs of average connectivity
$c < q \log q - 3q/2$ for all $q > 3$ (see [21] for references to previous works). For 3-coloring the best algorithmic lower
bound is $c = 4.03$ [10]. An important and interesting open question [22] is the existence of an $\epsilon > 0$ and of a
polynomial algorithm which $q$-colors almost surely a random graph of connectivity $c = (1 + \epsilon)\, q \log q$ for
arbitrarily large $q$.

The sharp coloring threshold and the onset of hardness in its vicinity have also triggered a lot of interest within the
statistical physics community following the discovery of a close relation between constraint satisfaction problems and
spin glasses [23, 24]. In physical terms, solving a CSP consists in finding ground states of zero energy. The limit of
infinitely large graphs corresponds to the thermodynamic limit. In the case of the $q$-coloring problem, for instance,
one studies the zero temperature behavior of the anti-ferromagnetic Potts model [25]. Using this correspondence,
a powerful heuristic tool called the cavity method has been developed [U H(| [U l28|: it allows an exact analytical
study of CSPs on sparse random graphs. Unfortunately, some pieces are still missing to make the cavity method
a rigorous tool, although many of its results have been rigorously proven. The cavity method is equivalent to the famous
replica method [29] (and, unfortunately for clarity, it has also inherited some of its notation, as we shall see).

FIG. 1: A sketch of the set of solutions when the average connectivity (degree) is increased. At low connectivities (on the
left), all solutions are in a single cluster. For larger $c$, clusters of solutions appear but the single giant cluster still exists and
dominates the measure. At the dynamic/clustering transition $c_d$, the phase space splits into an exponential number of clusters.
At the condensation/Kauzmann transition $c_c$, the measure condenses over the largest of them. Finally, no solutions exist above
the COL/UNCOL transition $c_s$. The rigidity/freezing transition $c_r$ (which might come before or after the condensation
transition) takes place when the dominating clusters start to contain frozen variables (the dominating clusters are a minimal
set of clusters that covers almost all proper colorings). The clusters containing frozen variables are colored in black and
those which do not are colored in grey.

The cavity method and the statistical physics approach have been used to study the $q$-coloring of random graphs
in [30, 31, 32]. The coloring threshold $c_s$ was calculated [30], the self-consistency of the solution checked [31], and the
large $q$ asymptotics of the coloring threshold computed [31]. These results are believed to be exact, although proving
their validity rigorously remains a major subject in the field. Nevertheless, like the results obtained for the random
satisfiability problem [28], [HI, they agree perfectly with rigorous mathematical bounds (5, lijl. Hoi. [Til. H2I [T3L flU
and with numerical simulations. The coloring threshold is thus fairly well understood, at least at the level of the cavity
method.

A perhaps more interesting outcome of the statistical physics analysis of CSPs was the identification of a new
transition which concerns the structure of the set of solutions and appears before the coloring threshold 0,
[H, [H[: while at low connectivities all solutions are in a single pure state (cluster) [86], the set of solutions splits
into an exponential number of different states (clusters) at a connectivity strictly smaller than $c_s$. Roughly speaking,
clusters are groups of nearby solutions that are in some sense disconnected from each other. Recently, the existence
of the clustered phase was proven rigorously in some cases for the satisfiability problem [3|| |37|. A major step was
made by applying the cavity equations to a single instance: this led to the development of a very efficient
message-passing algorithm called Survey Propagation (SP), which was originally used for the satisfiability problem in [2~ij
and later adapted to the coloring problem in [30]. Survey propagation allows one to find solutions of large random
instances even in the clustered phase and very near the coloring threshold.

Despite all this success, the cavity description of the clustered phase was incomplete in many respects, and a
first improvement was made with the introduction of a refined zero temperature cavity formalism that allows
a more detailed description of the geometrical properties of the clusters [38, 39]. We pursue this direction and
give for the first time a detailed characterization of the structure of the set of solutions. We observed in particular
that the clustering threshold had not been computed correctly, that other important transitions had been overlooked, and
that the global picture was mixed up. The corrected picture that we describe in this paper is the following: when the
connectivity is increased, the set of solutions undergoes several phase transitions similar to those observed in mean
field structural glasses (we sketch these successive transitions in fig. 1). First, the phase space of solutions decomposes
into an exponential number of states which are entropically negligible with respect to one large cluster. Then, at the
clustering threshold $c_d$, even this large state decomposes into an exponential number of smaller states. Subsequently,
above the condensation threshold $c_c$, most of the solutions are found in a finite number of the largest states. Eventually,
the connectivity $c_s$ is reached beyond which no more solutions exist. Another important transition, which we refer to
as rigidity, takes place at $c_r$ when a finite fraction of frozen variables appears inside the dominant pure states (those
containing almost all the solutions). All these transitions are sharp, and we computed the values of the corresponding
critical connectivities.

A nontrivial ergodicity breaking takes place at the clustering transition; as a consequence, uniform sampling of solutions
becomes hard. On the other hand, clustering itself is not responsible for the hardness of finding a solution. Moreover,
up to the condensation transition many results obtained by neglecting the clustering effect are correct. In particular, for
all $c < c_c$: i) the number of solutions is correctly given by the annealed entropy (and, for a general CSP, by the replica
symmetric entropy), and ii) a simple message-passing algorithm such as Belief Propagation (BP) [13, El converges to
a set of exact marginals (i.e. the probabilities that a given node takes a given color). As a consequence, we can use BP
plus a decimation-like strategy to find proper colorings of a given graph. Finally, we give some arguments to explain
why the rigidity transition is a better candidate for the onset of computational hardness in finding solutions.

Our results are obtained within the one-step replica symmetry breaking approach, and we believe (and argue
partially in section HvT) that our results would not be modified by considering further steps of RSB (as opposed to
previous conclusions [31]).

A shorter and partial version of our results, together with a study of similar issues in the satisfiability problem, was
already published in [12]. We refer to [43] for a detailed discussion of the satisfiability problem. The paper is organized
as follows: In section II we present the model. In section III we introduce the cavity formalism at the so-called
replica symmetric level, and discuss in detail why and where this approach fails. In section IV we take into account the
existence of clusters of solutions and employ the so-called one-step replica symmetry breaking formalism to describe
the properties of clusters. The results for several ensembles of random graphs are then presented in section V. We
finally discuss the algorithmic implications of our findings in section VI and conclude with a general discussion. Some
appendices with detailed computations complete the paper.

II. THE MODEL 
A. Definition of the model 

For the statistical physics analysis of the $q$-coloring problem [30, 32, 44] we consider a Potts [25] spin model with
anti-ferromagnetic interactions where each variable $s_i$ (spin, node, vertex) is in one of the $q$ different states (colors)
$s = 1, \dots, q$. Consider a graph $G = (\mathcal{V}, \mathcal{E})$ defined by its vertices $\mathcal{V} = \{1, \dots, N\}$ and edges
$(i,j) \in \mathcal{E}$ which connect pairs of vertices $i, j \in \mathcal{V}$; we write the Hamiltonian as

$$\mathcal{H}(\{s\}) = \sum_{(i,j) \in \mathcal{E}} \delta(s_i, s_j). \tag{1}$$

With this choice there is no energy contribution for neighbors with different colors, but a positive contribution
otherwise. The ground state energy is thus zero if and only if the graph is $q$-colorable. The Hamiltonian leads to a
Gibbs measure [45] over configurations (where $\beta$ is the inverse temperature):

$$\mu(\{s\}) = \frac{1}{Z_0}\, e^{-\beta \mathcal{H}(\{s\})}. \tag{2}$$

In the zero temperature limit, where $\beta \to \infty$, this measure becomes uniform over all the proper colorings of the graph.
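As a small illustration of eq. (1), the following sketch counts the monochromatic edges of a coloring and finds the ground state energy of a tiny graph by exhaustive enumeration. The function names `coloring_energy` and `ground_state_energy` are ours, introduced only for this example, and the brute-force search is practical only for very small graphs.

```python
from itertools import product


def coloring_energy(edges, colors):
    """Energy of eq. (1): the number of edges whose endpoints share a color."""
    return sum(1 for i, j in edges if colors[i] == colors[j])


def ground_state_energy(n, edges, q):
    """Minimum of the energy over all q**n colorings (brute force, tiny n only)."""
    return min(coloring_energy(edges, s) for s in product(range(q), repeat=n))


# A triangle is 3-colorable but not 2-colorable.
triangle = [(0, 1), (1, 2), (0, 2)]
print(coloring_energy(triangle, [0, 1, 2]))   # proper coloring
print(ground_state_energy(3, triangle, q=2))  # best achievable with two colors
print(ground_state_energy(3, triangle, q=3))  # best achievable with three colors
```

A graph is $q$-colorable exactly when `ground_state_energy` returns zero, which is the statement made after eq. (1).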

B. Ensembles of Random Graphs 

We will consider ensembles of graphs that are given by a degree distribution $Q(k)$. The required property of $Q(k)$
is that its parameters should not depend on the size of the graph. All the analytical results will concern only very
large sparse graphs ($N \to \infty$). Provided the second and higher moments of $Q(k)$ do not diverge, such graphs are
locally tree-like in this limit 0, [j|. More precisely, call the $d$-neighborhood of a node $i$ the set of nodes which are at
distance at most $d$ from $i$. For $d$ arbitrary but finite, the $d$-neighborhood is almost surely a tree when $N \to \infty$.
This property is connected with the fact that the length $\ell$ of the shortest loops (up to a finite number of them) scales
with the graph size as $\ell \sim \log N$. We will consider the two following canonical degree distributions:

(i) Uniform degree distribution, $Q(k) = \delta(k - c)$, corresponding to the $c$-regular random graphs, where every vertex
has exactly $c$ neighbors.

(ii) Poissonian degree distribution, $Q(k) = e^{-c} c^k / k!$, corresponding to the Erdős-Rényi (ER) random graphs.
A simple way to generate graphs with $N$ vertices from this ensemble is to consider that each link is present with
probability $c/(N - 1)$. The binomial degree distribution converges to the Poissonian in the large size limit.

It will turn out that the cavity techniques simplify considerably for regular graphs. However, ER graphs have the
advantage that their average connectivity is a real number that can be tuned continuously, which is obviously very
convenient when one wants to study phase transitions. It is thus useful to introduce a third ensemble, which still has
the computational advantage of regular graphs but at the same time gives more freedom to vary the connectivity:

(iii) Bi-regular random graphs, where nodes with connectivity $c_1$ are all connected to nodes with connectivity $c_2$, and
vice versa. There are thus two sets of nodes with degree distributions $Q(k) = \delta(k - c_1)$ and $Q(k) = \delta(k - c_2)$.

Notice that these graphs are bipartite by definition and therefore always have a trivial 2-coloring, which we will have
to dismiss in the following. This can be easily done within the cavity formalism (it is equivalent to neglecting the
ordered "crystal" phase in glass models [III).

In the first two cases, the parameter $c$ plays the role of the average connectivity, $c = \sum_k k\, Q(k)$. In the cavity
approach, a very important quantity is the excess degree distribution $Q_1(k)$, i.e. the distribution of the number of
neighbors, different from $j$, of a vertex $i$ adjacent to a random edge $(ij)$:

$$Q_1(k) = \frac{(k+1)\, Q(k+1)}{c}. \tag{3}$$

This distribution remains Poissonian for Erdős-Rényi graphs, whereas the excess degree is equal to $c - 1$ in the case
of regular graphs. In the bi-regular case, there are two sets of nodes with excess degrees $c_1 - 1$ and $c_2 - 1$.
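The two statements above (the edge-by-edge construction of ER graphs and the form of the excess degree distribution) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `erdos_renyi_edges`, `poisson_pmf` and `excess_degree_pmf` are ours and are introduced purely for illustration.

```python
import math
import random


def erdos_renyi_edges(n, c, rng):
    """Each of the n(n-1)/2 possible links is present with probability c/(n-1)."""
    p = c / (n - 1)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


def poisson_pmf(k, c):
    """Poissonian degree distribution Q(k) = exp(-c) c^k / k!."""
    return math.exp(-c) * c**k / math.factorial(k)


def excess_degree_pmf(k, degree_pmf, c):
    """Excess degree distribution of eq. (3): Q_1(k) = (k+1) Q(k+1) / c."""
    return (k + 1) * degree_pmf(k + 1) / c


rng = random.Random(0)
n, c = 400, 4.0
edges = erdos_renyi_edges(n, c, rng)
print("empirical mean degree:", 2 * len(edges) / n)

# For a Poissonian Q(k), the excess degree distribution is again Poissonian.
for k in range(5):
    print(k, excess_degree_pmf(k, lambda d: poisson_pmf(d, c), c), poisson_pmf(k, c))

# For a c-regular graph, Q(k) = delta(k - c), the excess degree is exactly c - 1.
regular = lambda d: 1.0 if d == 5 else 0.0
print(excess_degree_pmf(4, regular, 5))
```

The quadratic loop over pairs is kept for transparency; for large $N$ one would sample the edges of a sparse graph more efficiently.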



III. THE CAVITY FORMALISM AT THE REPLICA SYMMETRIC LEVEL 



We start by reviewing the replica symmetric (RS) version of the cavity method [26, 27] and its implications for the
coloring problem. In the last part of the section we show when, and why, the RS approach fails.



A. The replica symmetric cavity equations 

The coloring problem on a tree is solved exactly by an iterative method called the belief propagation algorithm [4(|
(note that some boundary conditions have to be imposed, otherwise a tree is always 2-colorable), which is equivalent to the
replica symmetric cavity method [41]. At this level, the method is in fact a classical tool in statistical physics for dealing
with tree structures; it dates back to the original ideas of Bethe, Peierls and Onsager [47]. It allows one to
compute the marginal probabilities that a given node takes a given color and, in the language of statistical physics,
observables like energy, entropy, average magnetization, etc. The applicability of the method, however, goes beyond
tree graphs, and we will discuss when it is correct for random locally tree-like graphs in section III C.

FIG. 2: Iterative construction of a tree by adding a new Potts spin.



Let us now describe the RS cavity equations. Denote by $\psi^{i\to j}_{s_i}$ the probability that the spin $i$ has color $s_i$ when the
edge $(ij)$ is not present, and consider the iterative construction of a tree in fig. 2. The probability follows a recursion:

$$\psi^{i\to j}_{s_i} = \frac{1}{Z_0^{i\to j}} \prod_{k \in i-j} \sum_{s_k} e^{-\beta \delta_{s_i s_k}}\, \psi^{k\to i}_{s_k} = \frac{1}{Z_0^{i\to j}} \prod_{k \in i-j} \left[ 1 - (1 - e^{-\beta})\, \psi^{k\to i}_{s_i} \right], \tag{4}$$

where $Z_0^{i\to j}$ is a normalization constant (the cavity partition sum) and $\beta$ the inverse temperature. The notation
$k \in i - j$ means the set of neighbors of $i$ except $j$. The normalization $Z_0^{i\to j}$ is related to the free energy shift after
the addition of the node $i$ and the edges around it, except $(ij)$, as

$$Z_0^{i\to j} = e^{-\beta \Delta F^{i\to j}}. \tag{5}$$

In the same way, the free energy shift after the addition of the node $i$ and all the edges around it is

$$e^{-\beta \Delta F^{i}} = Z_0^{i} = \sum_{s} \prod_{k \in i} \left[ 1 - (1 - e^{-\beta})\, \psi^{k\to i}_{s} \right]. \tag{6}$$

$Z_0^{i}$ is at the same time the normalization of the total probability that a node $i$ takes color $s$ (the marginal of $i$):

$$\psi^{i}_{s} = \frac{1}{Z_0^{i}} \prod_{k \in i} \left[ 1 - (1 - e^{-\beta})\, \psi^{k\to i}_{s} \right]. \tag{7}$$

The free energy shift after the addition of an edge $(ij)$ is

$$e^{-\beta \Delta F^{ij}} = Z_0^{ij} = 1 - (1 - e^{-\beta}) \sum_{s} \psi^{i\to j}_{s}\, \psi^{j\to i}_{s}. \tag{8}$$

The free energy density in the thermodynamic limit is then given by (see for instance [26])

$$f(\beta) = \frac{1}{N} \left( \sum_{i} \Delta F^{i} - \sum_{ij} \Delta F^{ij} \right). \tag{9}$$

Note that this relation for the free energy is variational, i.e. if one differentiates with respect to $\beta$, then only the
explicit dependence needs to be considered (see [lH for details). The energy (the number of contradictions) and the
corresponding entropy (the logarithm of the number of solutions with a given energy) densities can then be computed
from the Legendre transform

$$-\beta f(\beta) = -\beta e + s(e), \tag{10}$$

where $f = F/N$, $e = E/N$ and $s = S/N$ are intensive variables.
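The recursion (4) and the free energy (6), (8), (9) translate directly into a short program. The sketch below is a minimal pure-Python illustration, not the implementation used for the results of this paper: the function names, the random initialization, and the sequential update schedule are our own choices. On a tree the result is exact, which the example checks against the partition sum $Z = q\,(q - 1 + e^{-\beta})^2$ of a three-node path.

```python
import math
import random


def run_bp(n, edges, q, beta, sweeps=100, seed=0):
    """Iterate eq. (4) on a graph; returns messages psi[(i, j)] and neighbor lists."""
    rng = random.Random(seed)
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    w = 1.0 - math.exp(-beta)
    psi = {}
    for i, j in edges:
        for a, b in ((i, j), (j, i)):
            v = [rng.random() + 0.1 for _ in range(q)]  # arbitrary positive start
            total = sum(v)
            psi[(a, b)] = [x / total for x in v]
    for _ in range(sweeps):
        for (i, j) in list(psi):
            new = [1.0] * q
            for k in nbrs[i]:
                if k == j:
                    continue  # product over k in i - j
                for s in range(q):
                    new[s] *= 1.0 - w * psi[(k, i)][s]
            z = sum(new)  # cavity partition sum Z_0^{i->j}
            psi[(i, j)] = [x / z for x in new]
    return psi, nbrs


def bethe_free_energy(n, edges, q, beta, psi, nbrs):
    """Free energy density of eq. (9) from the site (6) and edge (8) shifts."""
    w = 1.0 - math.exp(-beta)
    total = 0.0
    for i in range(n):
        z_i = 0.0
        for s in range(q):
            term = 1.0
            for k in nbrs[i]:
                term *= 1.0 - w * psi[(k, i)][s]
            z_i += term
        total += -math.log(z_i) / beta
    for i, j in edges:
        overlap = sum(psi[(i, j)][s] * psi[(j, i)][s] for s in range(q))
        total -= -math.log(1.0 - w * overlap) / beta
    return total / n


# Path 0 - 1 - 2 with q = 3 colors: BP is exact on a tree.
path = [(0, 1), (1, 2)]
q, beta = 3, 1.5
psi, nbrs = run_bp(3, path, q, beta)
f_bp = bethe_free_energy(3, path, q, beta, psi, nbrs)
f_exact = -math.log(q * (q - 1 + math.exp(-beta)) ** 2) / (beta * 3)
print(f_bp, f_exact)
```

The sketch works at finite $\beta$ only; at $\beta = \infty$ the normalizations can vanish when messages become fully polarized, and a dedicated treatment is needed.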

The learned reader will notice that in some previous works using the cavity method [30, 31], these equations were
often written for a different object than the probabilities $\psi$. Instead, the so-called cavity magnetic fields $h$ and biases
$u$ were considered. The two approaches are related via

kei-j 

Strictly speaking, the $\psi$ are "cavity probabilities" while $h$ and $u$ are "cavity fields"; however, the $\psi$ are sometimes also
referred to as cavity fields (or messages) in the literature, and the reader will thus forgive us if we do so.

Note that each of the two notations is suitable for performing the zero temperature limit in a different way: in the
first one we fix the energy to zero and obtain the zero temperature BP recursion, which gives the marginal probabilities
for each variable, while in the second case we obtain a simpler recursion called Warning Propagation [49] that deals
with the energetic contributions but neglects the entropic ones [HI, [30]. This is the origin of the discrepancy between
the RS results of refs. [13] and [30]. As we shall see, the differences between these two limits will be an important
point in the paper.



B. Average over the ensemble of graphs and the RS solution 

To compute (quenched) averages of observables over the considered ensemble of random graphs given by the degree
distribution $Q(k)$ we need to solve the self-consistent cavity functional equation

$$\mathcal{P}(\psi) = \sum_{k} Q_1(k) \prod_{i=1}^{k} \int d\psi^{i}\, \mathcal{P}(\psi^{i})\; \delta\!\left( \psi - \mathcal{F}(\{\psi^{i}\}) \right), \tag{12}$$

where $Q_1(k)$ is the distribution of the number of neighbors given that there is already one neighbor, and the function
$\mathcal{F}(\{\psi^{i}\})$ is given by eq. (4). Beware that $\psi$ here is a $q$-component vector; we omit the vector notation to lighten
the reading. This equation is quite complicated since the order parameter is nontrivial, but we can solve it numerically
using the population dynamics method described in [26, 27].
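To make the idea of population dynamics concrete, here is a minimal sketch for the ER ensemble at finite temperature. It represents $\mathcal{P}(\psi)$ by a finite list of fields and repeatedly replaces a random member by the output of the recursion (4) applied to $k$ randomly chosen members, with $k$ drawn from the Poissonian excess degree distribution. The population size, number of sweeps, initialization, and function names are assumptions made for this illustration and are not taken from [26, 27].

```python
import math
import random


def sample_poisson(c, rng):
    """Draw a Poisson(c) integer by multiplying uniforms (adequate for small c)."""
    limit = math.exp(-c)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def rs_population_dynamics(q, c, beta, pop_size=1000, sweeps=40, seed=0):
    """Approximate a fixed point of eq. (12) for ER graphs of mean degree c."""
    rng = random.Random(seed)
    w = 1.0 - math.exp(-beta)
    pop = []
    for _ in range(pop_size):
        v = [rng.random() + 0.1 for _ in range(q)]  # arbitrary positive start
        total = sum(v)
        pop.append([x / total for x in v])
    for _ in range(sweeps * pop_size):
        k = sample_poisson(c, rng)  # excess degree, Poissonian for ER graphs
        new = [1.0] * q
        for _ in range(k):
            psi = rng.choice(pop)
            for s in range(q):
                new[s] *= 1.0 - w * psi[s]
        z = sum(new)
        pop[rng.randrange(pop_size)] = [x / z for x in new]
    return pop


pop = rs_population_dynamics(q=3, c=1.0, beta=2.0, pop_size=500, seed=1)
print(max(abs(x - 1.0 / 3.0) for psi in pop for x in psi))
```

At this low connectivity the population collapses onto the uniform field $\psi_s = 1/q$, up to a residual that shrinks with the number of sweeps.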

Throughout this paper we will search only for color symmetric functions $\mathcal{P}(\psi)$, i.e. those invariant under permutation
of colors. Clearly, with this assumption we might miss some solutions of (12). Consider for example $q > 2$ colors and
the ensemble of random bi-regular graphs. Since every bipartite graph is 2-colorable, there are $q(q-1)/2$ corresponding
color asymmetric solutions for $\mathcal{P}(\psi)$. For the ensembles of random graphs considered here, we later argue that this
assumption is nevertheless justified.

Another important observation is that for regular graphs equation (12) simplifies crucially: the solution factorizes
[HI in the sense that the order parameter $\psi$ is the same for each edge. This is due to the fact that, locally, every
edge in such a regular graph has the same environment. All edges are therefore equivalent and thus the distribution
$\mathcal{P}(\psi)$ has to be a delta function. For the bi-regular graphs, the solution of (12) also factorizes, but the two sets of
nodes of connectivity $c_1$ and $c_2$ (each of them being connected to the other) have to be considered separately.

It is immediate to observe that $\mathcal{P}(\psi) = \delta(\psi - 1/q)$ (i.e. each of the $q$ components of each cavity field $\psi$ equals $1/q$)
is always a solution of (12). By analogy with magnetic systems we shall call this solution paramagnetic. Numerically,
we do not find any other solution in the colorable phase. For regular random graphs the paramagnetic solution
is actually the only factorized one. The number of proper colorings predicted by the RS approach is thus easy to
compute. Since all messages are of the type $\mathcal{P}(\psi) = \delta(\psi - 1/q)$, the free energy density simply reads



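$$-\beta f_{RS} = \log q + \frac{c}{2} \log\left( 1 - \frac{1 - e^{-\beta}}{q} \right). \tag{13}$$

The entropy density at zero temperature thus follows

$$s_{RS} = \log q + \frac{c}{2} \log\left( 1 - \frac{1}{q} \right). \tag{14}$$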

It coincides precisely with the annealed (first moment) entropy. We will see in the following that, surprisingly, the 
validity of this formula goes well beyond the RS phase (actually until the so-called condensation transition). 
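Formula (14) is easy to evaluate. The sketch below computes $s_{RS}$ and the connectivity at which it vanishes, which is the standard first moment upper bound on the coloring threshold; the function names are ours and serve only this illustration.

```python
import math


def rs_entropy(q, c):
    """Annealed / RS entropy density of eq. (14)."""
    return math.log(q) + 0.5 * c * math.log(1.0 - 1.0 / q)


def first_moment_bound(q):
    """Connectivity at which the entropy of eq. (14) vanishes."""
    return 2.0 * math.log(q) / (-math.log(1.0 - 1.0 / q))


for q in (3, 4, 5):
    print(q, first_moment_bound(q), rs_entropy(q, first_moment_bound(q)))
```

For $q = 3$ the entropy (14) vanishes near $c \approx 5.42$, so it is still positive at connectivities around 4.7.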



C. Validity conditions of the replica symmetric solution 

We used the main assumption of the replica symmetric approach when we wrote eq. (4): we supposed that the
cavity probabilities $\psi^{k\to i}$ for the neighbors $k$ of the node $i$ are "sufficiently" independent in the absence of the node $i$,
because only then does the joint probability factorize. This assumption would be true if the lattice were a tree with
uncorrelated boundary conditions, but loops, or correlations in the boundaries, may create correlations between the
neighbors of node $i$ (in the absence of $i$), and the RS cavity assumption might thus cease to be valid in a general graph.
The aim here is to make this statement precise and to quantify it from both a rigorous and a heuristic point of view.



1. The Gibbs measure uniqueness condition 



Proving rigorously the correctness of the RS cavity assumption for random graphs is a crucial step that has not yet
been successfully completed in most cases. The only success so far was obtained by proving a far too strong condition:
the Gibbs measure uniqueness [4^, 50, 51, 52]. Roughly speaking, the Gibbs measure (2) is unique if the behavior
of a spin $i$ is totally independent of the boundary conditions (i.e. very distant spins) for any possible boundary
conditions. More precisely, let us denote by $\{s_l\}$ the colors of all the spins at distance at least $l$ from the spin $i$. The Gibbs
measure $\mu$ is unique if and only if the following condition holds for every $i$ (and in the limit $N \to \infty$):



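$$\mathbb{E}\left[ \sup_{\{s_l\}, \{s'_l\}} \sum_{s_i=1}^{q} \left| \mu(s_i | \{s_l\}) - \mu(s_i | \{s'_l\}) \right| \right] \xrightarrow[l \to \infty]{} 0, \tag{15}$$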



where the average is over the ensemble of graphs. In [50, 51] it was proven that the Gibbs measure in the coloring
problem on random regular graphs is unique only for graphs of degree $c < q$. To the best of our knowledge, this has
not been computed for Erdős-Rényi graphs (later in this section, we argue on the basis of a physical argument that
it should be $c < q - 1$ in this case).

2. The Gibbs measure extremality condition



In many cases, the RS approach is observed to be correct beyond the uniqueness threshold. It was thus suggested
in [53] (see also [42]) that the Gibbs measure extremality provides a proper criterion for the correctness of the replica
symmetric assumption. Roughly speaking, the difference between uniqueness and extremality of a Gibbs measure is
that, although there may exist boundary conditions for which the spin $i$ behaves differently than for others, such
boundary conditions have null measure if the extremality condition is fulfilled. Formally (and keeping the notation
of the previous section), extremality corresponds to



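$$\mathbb{E}\left[ \sum_{\{s_l\}} \mu(\{s_l\}) \sum_{s_i=1}^{q} \left| \mu(s_i | \{s_l\}) - \mu(s_i) \right| \right] \xrightarrow[l \to \infty]{} 0. \tag{16}$$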



In mathematics, "extremal Gibbs measure" is often used as a synonym for "pure state". Recently, the authors
of [53l | provided rigorous bounds for the Gibbs measure extremality of the coloring problem on trees.

There exist two equivalent heuristic approaches to check the extremality condition. In the first one, one studies
the divergence of the so-called "point-to-set" correlation length [HJ HI]. The second one is directly related to the
cavity formalism: one should check for the existence of a nontrivial solution of the one-step replica symmetry breaking
(1RSB) equations at $m = 1$ (see section IV). Both these analogies were remarked in [53] and exploited in [453]. We will
show in section V that the extremality condition ceases to be valid at the clustering threshold $c_d$, beyond which the
1RSB formalism will be needed.



3. The local stability: a simple self-consistency check

A necessary validity condition for the RS assumption, simple to compute but not sufficient, is the non-divergence
of the spin glass susceptibility (see for instance [HI). If it diverges, a spin glass transition occurs, and the replica
symmetry has to be broken [29]. The local stability analysis thus gives an upper bound on the Gibbs extremality
threshold, which remarkably coincides with the rigorous upper bound of [57]. This susceptibility is defined as

$$\chi_{SG} = \frac{1}{N} \sum_{i,j} \langle s_i s_j \rangle_c^2. \tag{17}$$

The connectivities above which it diverges at zero temperature can be computed exactly within the cavity formalism
(we refer to appendix [X] for the derivation). It follows for regular and Erdős-Rényi graphs:

$$c^{\rm reg}_{RS} = q^2 - 2q + 2, \qquad c^{\rm ER}_{RS} = q^2 - 2q + 1, \tag{18}$$

while the stability of the bi-regular graphs of connectivities $c_1$, $c_2$ is equivalent to that of regular graphs with
$c = 1 + \sqrt{(c_1 - 1)(c_2 - 1)}$.
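For quick reference, the thresholds of eq. (18) and the effective connectivity of the bi-regular ensemble can be tabulated with a few lines of code. The function names are hypothetical and serve only this illustration.

```python
import math


def rs_stability_regular(q):
    """Local RS stability threshold for regular graphs, eq. (18)."""
    return q * q - 2 * q + 2


def rs_stability_er(q):
    """Local RS stability threshold for Erdős-Rényi graphs, eq. (18)."""
    return q * q - 2 * q + 1


def biregular_effective_connectivity(c1, c2):
    """Regular-graph connectivity with the same stability as a (c1, c2) bi-regular graph."""
    return 1.0 + math.sqrt((c1 - 1) * (c2 - 1))


for q in (3, 4, 5):
    print(q, rs_stability_regular(q), rs_stability_er(q))
print(biregular_effective_connectivity(3, 9))
```

Note that the ER threshold equals $(q-1)^2$ and that a bi-regular graph with $c_1 = c_2 = c$ reduces to the regular case, as it should.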

Note that for regular and ER graphs the RS instability threshold is in the colorable phase only for $q = 3$. Indeed, the
5-regular graphs are 3-colorable [31] (and rigorous results in [TH, EH) and exactly critical since $c^{\rm reg}_{RS}(3) = 5$, and for ER
graphs the COL/UNCOL transition appears at $c_s \approx 4.69$ [H| while $c^{\rm ER}_{RS}(3) = 4$. This means that the replica symmetry
breaking transition appears continuously at the point $c_{RS}$, so that above it the RS approach is no longer valid. For
all $q \ge 4$, however, the local stability point is found beyond the best upper bound on the coloring threshold for both
regular and ER graphs. In this case, the extremality condition will not be violated by the continuous mechanism, but
we will see that, instead, a discontinuous phase transition, as happens in mean field structural glasses, will take place.

Interestingly enough, a similar computation can be made for the ferromagnetic susceptibility
$\chi_F = \frac{1}{N} \sum_{i,j} \langle s_i s_j \rangle_c$ (see again appendix [X]). It diverges at $c = q$ for regular graphs and
$c = q - 1$ for Erdős-Rényi graphs. This divergence
(called modulation instability in [56]) would announce the transition towards an anti-ferromagnetic ordering on a tree,
which is however incompatible with the frustrating loops in a random graph (although such order might exist on the
bi-regular graphs). This is precisely the solution we dismiss when considering only the color symmetric solution of
(12). Note however that the presence of this instability shows that the problem ceases to have a unique Gibbs
state (although it is still extremal), as for some specific (and well-chosen) boundary conditions an anti-ferromagnetic
solution may appear. Indeed, it coincides perfectly with the rigorous uniqueness condition $c = q$ for regular graphs,
and suggests strongly that the uniqueness threshold (or at least an upper bound) for Erdős-Rényi graphs is $c = q - 1$.

IV. ONE-STEP REPLICA SYMMETRY BREAKING FRAMEWORK



So far we described the RS cavity method for coloring random graphs and explained that the extremality of the Gibbs
measure gives a validity criterion. We now describe the one-step replica symmetry breaking cavity solution [26, 27].
In this approach, the non-extremality of the Gibbs measure is cured by decomposing it into several parts (pure states,
clusters) in such a way that within each of the states the Gibbs measure becomes extremal again.

This decomposition has many elements (states), not just a finite number like the $q$ states of the usual ferromagnetic
Potts model. It is actually found that the number of pure states grows exponentially with the size of the system.
Let us define the state-entropy function $\Sigma(f)$, called the complexity, which is just the logarithm of the number of
states with internal free energy density $f$, i.e. $\mathcal{N}(f) = \exp[N \Sigma(f)]$. In the glass transition formalism, this complexity
is usually referred to as the configurational entropy. Dealing with exponentially many pure states is obviously a
nightmare for all known rigorous approaches to the thermodynamic limit. The heuristic cavity method overcomes
this problem elegantly, as was shown originally in the seminal work of [26, 27].

Another very useful intuition about the 1RSB cavity method comes from the identification of states a with the 
fixed points {ip} of the belief propagation equations ([!]). The goal is thus to compute the statistical properties of these 
fixed points. Each of the states is weighted by the corresponding free energy © to the power to, where m is just 
a parameter analogous to the inverse temperature [3 (in the literature m is often called the Parisi replica symmetry 
breaking parameter [2^, [5^]). The probability measure over states {tp} is then 

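$$\mu(\{\psi\}) = \frac{Z_0(\{\psi\})^m}{Z_1} = \frac{1}{Z_1}\, e^{-\beta m N f(\{\psi\})}, \qquad (19)$$

where $Z_1$ is just the normalization constant. To write the analog of the belief propagation equations we need to define
the probability (distribution) $P^{i\to j}(\psi^{i\to j})$ of the fields $\psi^{i\to j}$. This can be obtained from those of
incoming fields as

$$\begin{aligned}
P^{i\to j}(\psi^{i\to j}) &= \frac{1}{Z_1^{i\to j}} \int \prod_{k\in i-j} d\psi^{k\to i}\, P^{k\to i}(\psi^{k\to i})\;
  \delta\big(\psi^{i\to j} - \mathcal{F}(\{\psi^{k\to i}\})\big)\, \big(Z_0^{i\to j}\big)^m \\
&\equiv \frac{1}{Z_1^{i\to j}} \int_{\rm POP} \delta(\psi - \mathcal{F})\, \big(Z_0^{i\to j}\big)^m . \qquad (20)
\end{aligned}$$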

The function $\mathcal{F}$ is given by eq. (4) and the delta function ensures that the set of fields is a fixed point of the
belief propagation (4). The re-weighting term $(Z_0^{i\to j})^m$ takes into account the change of the free energy of a state
after the addition of a cavity spin $i$ and its adjacent edges except $(ij)$, as defined in eq. (5). This term appears for the
same reason as the Boltzmann factor $e^{-\beta\delta_{s_i,s_k}}$ in eq. (4): it ensures that the state $\alpha$ is weighted by
$(Z_\alpha)^m$ in the same way a configuration $\{s\}$ is weighted by $e^{-\beta H(\{s\})}$ in (2). Finally $Z_1^{i\to j}$ is a
normalization constant. In the second line of (20) we introduced an abbreviation that will be used from now on to make
the equations more easily readable. The notation POP comes from "population dynamics", which refers to the numerical
method we use to solve eq. (20). The probability distribution $P(\psi)$ can be represented numerically by a set of fields
taken from $P(\psi)$, and then the probability measure $P(\psi)\,d\psi$ becomes uniform sampling from this set; for more
details see appendix D.
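The reweighting by $(Z_0^{i\to j})^m$ can be pictured as a weighted resampling step inside a population dynamics loop. The following is only a minimal sketch under stated assumptions: the function name `reweighted_resample`, the representation of candidates as `(field, Z0)` pairs, and the choice to resample with replacement are all illustrative and are not taken from the paper or its appendix.

```python
import random


def reweighted_resample(candidates, m, size, rng):
    """Draw `size` fields with probability proportional to Z0**m.

    candidates: list of (field, Z0) pairs produced by a BP-like update
                (hypothetical representation chosen for this sketch).
    m:          the reweighting exponent of the 1RSB equation.
    rng:        a random.Random instance, passed in for reproducibility.
    """
    # A candidate with Z0 == 0 comes from contradictory incoming fields and
    # must get zero weight for every m (note that 0.0 ** 0 == 1.0 in Python).
    weights = [z0 ** m if z0 > 0 else 0.0 for _, z0 in candidates]
    if sum(weights) == 0:
        raise ValueError("all candidate fields are contradictory")
    fields = [field for field, _ in candidates]
    return rng.choices(fields, weights=weights, k=size)


rng = random.Random(0)
candidates = [((1.0, 0.0, 0.0), 0.0), ((0.5, 0.5, 0.0), 2.0)]
new_population = reweighted_resample(candidates, m=1.0, size=10, rng=rng)
```

With this convention the normalization constant $Z_1^{i\to j}$ never has to be computed explicitly, since `random.choices` only needs relative weights.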
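We define the "replicated free energy" and compute it in analogy with eq. (9) as

$$\Phi(\beta,m) = \frac{1}{N}\left(\sum_i \Delta\Phi^{i} - \sum_{(ij)} \Delta\Phi^{ij}\right), \qquad (21)$$

where

$$e^{-\beta m \Delta\Phi^{i}} = \int_{\rm POP} e^{-\beta m \Delta F^{i}}, \qquad
  e^{-\beta m \Delta\Phi^{ij}} = \int_{\rm POP} e^{-\beta m \Delta F^{ij}}. \qquad (22)$$

Putting together (19) and (21) we have

$$Z_1 = e^{-\beta m N \Phi(\beta,m)} = \sum_{\{\psi\}} e^{-\beta m N f(\{\psi\})}
      = \int df\; e^{-N[\beta m f(\beta) - \Sigma(f)]}, \qquad (23)$$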

where the sum over $\{\psi\}$ is over all states (or BP fixed points). In the interpretation of [59] $m$ is the number of
replicas of the system, thus the name "replicated free energy" for $\Phi(\beta,m)$. Note that we are using the word
"replica" only to refer to the established terminology, as no replicas are needed within the cavity formalism. From the
saddle point method, it follows that the Legendre transform of the complexity function $\Sigma(f)$ gives the replicated
free energy $\Phi(\beta,m)$:

$$-\beta m \Phi(\beta,m) = -\beta m f(\beta) + \Sigma(f). \qquad (24)$$

Notice that this equation is correct only to leading order in the system size $N$, i.e. for densities in the
thermodynamic limit. From the properties of the Legendre transform we have

$$\Sigma = \beta m^2\, \partial_m \Phi(\beta,m), \qquad f = \partial_m\,[m\Phi(\beta,m)], \qquad \beta m = \partial_f \Sigma(f). \qquad (25)$$
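The relations (25) can be checked numerically once $\Phi(\beta,m)$ is available on a grid of $m$. The sketch below is an illustration only: the helper name, the central-difference step `h`, and the quadratic toy complexity $\Sigma(f) = \Sigma_0 - a(f-f_0)^2/2$ used to build a test function are assumptions made here, not quantities from the paper.

```python
def complexity_and_free_energy(phi, m, beta, h=1e-5):
    """Apply eq. (25) by central differences.

    phi:  callable m -> replicated free energy Phi(beta, m) at fixed beta.
    Returns (Sigma, f) with Sigma = beta*m^2*dPhi/dm and f = d(m*Phi)/dm.
    """
    dphi_dm = (phi(m + h) - phi(m - h)) / (2.0 * h)
    f = phi(m) + m * dphi_dm          # d(m*Phi)/dm = Phi + m*dPhi/dm
    sigma = beta * m * m * dphi_dm
    return sigma, f


# Toy model (hypothetical): Sigma(f) = sigma0 - a*(f - f0)**2 / 2.
# Eq. (24) then gives Phi(m) = f0 - beta*m/(2a) - sigma0/(beta*m).
beta, a, f0, sigma0 = 2.0, 4.0, -1.0, 0.3
phi = lambda m: f0 - beta * m / (2.0 * a) - sigma0 / (beta * m)

sigma, f = complexity_and_free_energy(phi, m=0.5, beta=beta)
# Analytically: f = f0 - beta*m/a = -1.25 and Sigma = sigma0 - (beta*m)**2/(2a) = 0.175
```

Scanning `m` and collecting the pairs `(f, sigma)` traces out the curve $\Sigma(f)$ parametrically, which is how the complexity is obtained implicitly in what follows.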

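Thus, from eq. (21), the free energy reads

$$f(\beta) = \frac{1}{N}\left(\sum_i \frac{\int_{\rm POP} \Delta F^{i}\, e^{-\beta m \Delta F^{i}}}{\int_{\rm POP} e^{-\beta m \Delta F^{i}}}
  - \sum_{(ij)} \frac{\int_{\rm POP} \Delta F^{ij}\, e^{-\beta m \Delta F^{ij}}}{\int_{\rm POP} e^{-\beta m \Delta F^{ij}}}\right). \qquad (26)$$

When the parameter $m$ is equal to one (the number of replicas is actually one in the approach of [59]), then
$-\beta\Phi(\beta,1) = -\beta f(\beta) + \Sigma(f)$ reduces to the usual free energy function considered in the RS approximation

$$\Phi(\beta,1) = e - T(s + \Sigma) = e - T s_{\rm tot}, \qquad (27)$$

where $s$ is the internal entropy density of the corresponding clusters and $s_{\rm tot}$ the total entropy density of the system.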



A. Analyzing the 1RSB equations 

Combining (21) and (26) we can compute $\Sigma$ and $f$ for each value of $m$, which gives us implicitly $\Sigma(f)$. To
compute the thermodynamic observables in the model we have to minimize the total free energy $f_{\rm tot} = f - \Sigma/\beta$
over such values of $f$ where the complexity $\Sigma(f)$ is non-negative (so that the states exist in the thermodynamic
limit). The minimum of the total free energy corresponds to a value of the parameter $m = m^*$, and states with the free
energy $f^*$ dominate the thermodynamics. Three different cases are then observed:

a) If there is only the trivial (replica symmetric) solution at $m = 1$, then the Gibbs measure (2) is extremal and
the replica symmetric approach is correct. If at the same time a nontrivial solution exists for some $m \neq 1$, then
the clusters corresponding to this solution have no influence on the thermodynamics.

b) If there is a nontrivial solution at $m = 1$ with a positive complexity, then $m^* = 1$ minimizes the total free
energy. The system is in a clustered phase with an exponential number of dominating states.

c) If however the complexity is negative at $m = 1$, then the corresponding states are absent with probability
one in the thermodynamic limit. Instead the total entropy is dominated by clusters corresponding to $m^*$ such
that $\Sigma(m^*) = 0$: the system is in a condensed phase. Note that the condition $\Sigma(m^*) = 0$ corresponds to the
maximum of the replicated free energy (21).
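The three cases a) to c) amount to a small decision procedure for $m^*$. A minimal sketch follows, under the assumptions that the complexity on the nontrivial branch is available as a callable `sigma(m)` and that it is continuous and decreasing in $m$ on the bracket used for bisection; the function name, the bracket `m_lo`, and the toy complexity in the usage lines are hypothetical.

```python
def dominant_m(sigma, has_nontrivial_at_1, m_lo=1e-3, tol=1e-10):
    """Return (phase, m_star) following cases a), b), c).

    sigma:               callable m -> complexity on the nontrivial branch.
    has_nontrivial_at_1: whether a nontrivial 1RSB solution exists at m = 1.
    """
    if not has_nontrivial_at_1:
        return "replica symmetric", None      # case a)
    if sigma(1.0) >= 0.0:
        return "clustered", 1.0               # case b): m* = 1
    if sigma(m_lo) < 0.0:
        raise ValueError("complexity is negative on the whole bracket")
    lo, hi = m_lo, 1.0                        # case c): solve sigma(m*) = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sigma(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return "condensed", 0.5 * (lo + hi)


# Toy complexity (illustrative only): vanishes at m = 0.5.
phase, m_star = dominant_m(lambda m: 0.25 - m * m, has_nontrivial_at_1=True)
```

Bisection is used here purely for robustness; any one-dimensional root finder would serve the same purpose.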

The transitions between these cases are well known in structural glass phenomenology where they appear when the
temperature is lowered [60, 61]. The transition from the paramagnetic phase to the clustered one is usually referred
to as the dynamical transition [62] or the clustering transition. It is not a true thermodynamic transition as the total
free energy of the system at $m^* = 1$ is still equal to the replica symmetric one (9) (see appendix C) and thus is an
analytical function of connectivity. However, the phase space is broken into exponentially many components and, as
a consequence, the dynamics falls out of equilibrium beyond this transition.

The second transition, from the clustered to the condensed phase, is however a genuine thermodynamic transition
(the free energy has a discontinuity in the second derivative at $c_c$) and is called the replica symmetry breaking
transition, or the static glass transition. At this point the measure condenses into few clusters, and we shall call it
the condensation transition. In structural glasses, it corresponds to the well known Kauzmann transition [63]. The
sizes of the clusters in the condensed phase follow the so-called Poisson-Dirichlet process, which is discussed briefly
in appendix B.

The procedure to compute the replicated free energy (21) and the related observables was described above for a
single large random graph. To compute the averages over the ensemble of random graphs, we need to solve an equation
analogous to eq. (12):

$$\mathcal{P}[P(\psi)] = \sum_k Q_1(k) \int \prod_{i=1}^{k-1} dP^{i}(\psi^{i})\, \mathcal{P}[P^{i}(\psi^{i})]\;
  \delta\big(P(\psi) - \mathcal{F}_2(\{P^{i}(\psi^{i})\})\big), \qquad (28)$$

where the functional $\mathcal{F}_2$ is given by eq. (20). Solving this equation for a general ensemble of random graphs and a
general parameter $m$ is numerically quite tedious. In the population dynamics algorithm [26, 27] we need
to deal with a population of populations of $q$-component fields. It is much more convenient to look at the ensemble
of random regular graphs, where a factorized solution $\mathcal{P}[P(\psi)] = \delta(P(\psi) - P_0(\psi))$ must exist. Then we are
left with only one functional equation (20).

Before discussing the zero temperature limit, we would like to point out that there exists another very important
case in which eq. (28) simplifies. For $m = 1$, as first remarked and proved in [53] when dealing with the problem of
reconstruction on trees, the equations can be written (and numerically solved) in a much simpler way. We again refer
to appendix C for details. This simplification is especially useful for the Poissonian random graphs.

B. Zero temperature limit 

We now consider the zero temperature limit $\beta \to \infty$ of the 1RSB equations (24)-(28) to study the coloring problem.
In most of the previous works [27, 28, 30] the energetic zero temperature limit was employed. The $\beta \to \infty$ limit of
eq. (24) was taken in such a way that $m\beta = y$ remains constant. The replicated free energy (24) then becomes

$$-y\,\Phi_e(y) = -y\,e + \Sigma(e). \qquad (29)$$

It is within this approach that the survey propagation (SP) algorithm was derived. The connectivity at which the
complexity function $\Sigma(e = 0)$ becomes negative is the coloring threshold. Above this connectivity $\Sigma(e)$ was used to
compute the minimal number of violated constraints (the ground state energy). The reweighting in eq. (20) becomes
$e^{-y\Delta E^{i\to j}}$, and when $y \to \infty$ all the configurations with positive energy are forbidden.

In this paper we adopt the entropic zero temperature limit, suggested originally in [38, 39]. The difference between the
two approaches was already underlined in sec. III A. We want to study the structure of proper colorings, i.e. the
configurations of zero energy, and we thus fix the energy to zero. Then we obtain the entropy by considering $-\beta f = s$
and introduce a free entropy (or, in replica terms, a "replicated entropy") as
$\Phi_s(m) = -\beta m \Phi(\beta,m)|_{\beta\to\infty}$. Eq. (24) then becomes

$$\Phi_s(m) = m s + \Sigma(s). \qquad (30)$$

The belief propagation update (4) becomes

$$\psi^{i\to j}_{s_i} = \frac{1}{Z_0^{i\to j}} \prod_{k\in i-j} \left(1 - \psi^{k\to i}_{s_i}\right), \qquad (31)$$

while the 1RSB equation (20) keeps the same expression (and thus the same computational complexity).
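In the zero temperature limit the belief propagation update has a simple form: the weight of color $s$ at node $i$ is proportional to the product over incoming neighbors $k$ of $(1 - \psi^{k\to i}_s)$, and the normalization is $Z_0^{i\to j}$. A minimal sketch of one such update is given below; the function name and the list-based representation of fields are illustrative choices, not the paper's implementation.

```python
def bp_update(incoming, q):
    """Zero-temperature BP update for q-coloring.

    incoming: list of cavity fields psi^{k->i}, each a length-q sequence of
              probabilities, for all neighbors k of i except j.
    Returns (psi, z0): the outgoing field psi^{i->j} and its normalization.
    """
    unnormalized = []
    for s in range(q):
        weight = 1.0
        for field in incoming:
            weight *= 1.0 - field[s]   # neighbor k must not have color s
        unnormalized.append(weight)
    z0 = sum(unnormalized)
    if z0 == 0.0:
        raise ValueError("contradictory incoming fields: every color is forbidden")
    return [u / z0 for u in unnormalized], z0


# Two neighbors frozen to colors 0 and 1 force the third color:
psi, z0 = bp_update([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], q=3)   # psi == [0.0, 0.0, 1.0]
```

The returned `z0` is the quantity that enters the reweighting factor of the 1RSB equation, raised to the power $m$.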

The partition sum $Z_0$ in (2) becomes in this limit the number of proper colorings or solutions. The clusters are now
sets of such solutions, and are weighted by their size to the power $m$. The free entropy $\Phi_s(m)$ is then computed as

$$\Phi_s(m) = \frac{1}{N}\left(\sum_i \Delta\Phi_s^{i} - \sum_{(ij)} \Delta\Phi_s^{ij}\right)
  = \frac{1}{N}\left[\sum_i \log \int_{\rm POP} \big(\Delta Z^{i}\big)^m - \sum_{(ij)} \log \int_{\rm POP} \big(\Delta Z^{ij}\big)^m\right], \qquad (32)$$

where $\Delta Z^{i}$ and $\Delta Z^{ij}$ are given by eqs. (5) and ©. The analysis from the previous section is valid also for
the entropic zero temperature limit. The information extracted from the number of clusters of a given size, $\Sigma(s)$, is
one of the main results of this paper and will be discussed and interpreted in section V.

C. The role of frozen variables 

In this section we discuss the presence and the role of the frozen variables and explain the connection between the
energetic and the entropic zero temperature limits. This allows us to revisit (and extend) the survey propagation
equations. Remember that the components of the cavity field $\psi^{i\to j}_{s_i}$ are the probabilities that the node $i$ takes
the color $s_i$ when the constraint on the edge $(ij)$ is not present. In the zero temperature limit we can classify them
in two categories:

(i) A hard field corresponds to the case when all components of $\psi^{i\to j}$ are zero except one, $s$. Then only that color
is allowed for the spin $i$, in absence of edge $(ij)$.

(ii) A soft field corresponds to the case when more than one component of $\psi^{i\to j}_s$ is nonzero. The variable $i$ is thus
not frozen in absence of edge $(ij)$, and the colors of all the nonzero components are allowed.
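The hard/soft distinction is easy to express in code. The sketch below is a hedged illustration: in a numerical population the test for "zero" needs a tolerance, and the value `tol=1e-12` as well as the function name are assumptions made here.

```python
def classify_field(field, tol=1e-12):
    """Classify a cavity field as 'hard' or 'soft'.

    field: sequence of q probabilities psi_s.
    Returns (kind, allowed) where `allowed` lists the colors with nonzero weight.
    """
    allowed = [s for s, p in enumerate(field) if p > tol]
    if not allowed:
        raise ValueError("field has no allowed color")
    kind = "hard" if len(allowed) == 1 else "soft"
    return kind, allowed


classify_field([0.0, 1.0, 0.0])     # a hard field: only color 1 is allowed
classify_field([0.5, 0.5, 0.0])     # a soft field: colors 0 and 1 are allowed
```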

This distinction is also meaningful for the full probabilities $\psi^{i}_{s}$: if $\psi^{i}$ is a hard field then the variable $i$
is frozen. In the colorable region there cannot exist a finite fraction of frozen variables (even if we properly take into
account the permutational symmetry of colors), since by adding a link the connectivity changes by $1/N$ but the
probability of becoming uncolorable would be finite. On the contrary, in the 1RSB picture, we observe that a finite
fraction of variables can be frozen within a single cluster. In other words, in all the solutions that belong to this given
cluster a finite fraction of variables can take one color only. By adding a link into the graph, the connectivity grows by
$1/N$, and there is a finite probability that a cluster with frozen variables disappears. The distinction between hard and
soft fields is useful not only for the intuition about clusters, but also for the analysis of the cavity equations, and it
also leads to the survey propagation algorithm.

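The distribution of fields over states $P^{i\to j}(\psi^{i\to j})$ (20) can be decomposed into the hard- and soft-field parts

$$P^{i\to j}(\psi^{i\to j}) = \sum_{s=1}^{q} \eta_s^{i\to j}\, \delta(\psi^{i\to j} - r_s) + \eta_0^{i\to j}\, \tilde{P}^{i\to j}(\psi^{i\to j}), \qquad (33)$$

where $\tilde{P}^{i\to j}$ is the distribution of the soft fields and the normalization is $\sum_{s=0}^{q} \eta_s^{i\to j} = 1$.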

Interestingly, the presence of frozen variables in the entropically dominating clusters is connected to the divergence
of the size of the average minimal rearrangement [55, 64]. Precisely, choose a random proper coloring $\{s\}$ and a random
node $i$ in the graph. The average minimal rearrangement is the Hamming distance to the nearest solution in which
node $i$ has a color different from $s_i$, averaged over the nodes $i$, the proper colorings, and graphs in the ensemble.

Another interesting role of the frozen variables arises within the whitening procedure, introduced in [65] and studied,
among others, for the satisfiability problem in [66, 67]. This procedure is equivalent to the warning propagation
(WP) update which we outlined in sec. III A. Whitening is able to identify whether a solution belongs to a cluster with
frozen variables or not. In particular, the result of the whitening is a set of hard cavity fields.

Since the survey propagation algorithm computes statistics over the states that contain hard fields, the
solution found after decimating the survey propagation result should a priori also contain hard fields. However, recent
works show that if one applies the whitening procedure starting from solutions found by SP on large graphs, whitening
converges every time to the trivial fixed point (see detailed studies for K-SAT in [66, 67]). A possible solution to this
apparent paradox is discussed in sec. VI.



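1. Hard fields in the simplest case, m = 0

Let us now consider the survey propagation equations originally derived in [30] from the energetic zero temperature
limit (29) when $y \to \infty$. For simplicity we will write them only for the 3-coloring. We consider the 1RSB cavity
equation (20) for $m \to 0$; then the reweighting factor $(Z_0^{i\to j})^m$ is equal to zero when the arriving hard fields are
contradictory, and equal to one otherwise. The update of the probability $\eta_s$ that a field is frozen in direction $s$ is
then written from eq. (20):

$$\eta_s^{i\to j} = \frac{\prod_{k\in i-j}\left(1-\eta_s^{k\to i}\right) - \sum_{p\neq s}\prod_{k\in i-j}\left(\eta_0^{k\to i}+\eta_p^{k\to i}\right) + \prod_{k\in i-j}\eta_0^{k\to i}}
{\sum_{p}\prod_{k\in i-j}\left(1-\eta_p^{k\to i}\right) - \sum_{p}\prod_{k\in i-j}\left(\eta_0^{k\to i}+\eta_p^{k\to i}\right) + \prod_{k\in i-j}\eta_0^{k\to i}}. \qquad (34)$$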

In the numerator there is a telescopic sum counting the probability that color $s$ and only color $s$ is not forbidden by
the incoming fields. In the denominator the telescopic sum is counting the probability that there is at least one color
which is not forbidden. If we do not want to actually find a proper coloring on a single graph but just to compute
the replicated free energy/entropy, we can further simplify eq. (34) by imposing the color symmetry. Indeed, the
probability that in a given state a field is hard in direction of a color $s$ has to be independent of $s$ (except $s = 0$,
which corresponds to a soft field). Then (34) becomes, now for a general number of colors $q$:

$$\eta^{i\to j} = w(\{\eta^{k\to i}\}) = \frac{\sum_{l=0}^{q-1}(-1)^l\binom{q-1}{l}\prod_{k\in i-j}\left[1-(l+1)\,\eta^{k\to i}\right]}
{\sum_{l=0}^{q-1}(-1)^l\binom{q}{l+1}\prod_{k\in i-j}\left[1-(l+1)\,\eta^{k\to i}\right]}. \qquad (35)$$

Note that since $\partial\Sigma(s)/\partial s = -m$, the value $m = 0$ corresponds to the point where the function $\Sigma(s)$ has a zero
slope. If a nontrivial solution of (35) exists, then $\Sigma(s)|_{m=0}$ is the maximum of the curve $\Sigma(s)$ and counts the
total log-number of clusters of size $s$, which is, due to the exponential dependence, also the total log-number of all
clusters, regardless of their size. There are two points that we want to emphasize:

• Suppose that a nontrivial solution of (35) exists, i.e. many clusters exist and their number can be computed
with the energetic zero temperature limit calculations. Then the clusters might be very small and contain very
few solutions in comparison to bigger, less numerous clusters, or in comparison to a giant single cluster which
might still exist. This situation cannot be decided by the energetic formalism, which weights clusters equally,
independently of their size.

• Suppose, on the contrary, that a nontrivial solution of (35) does not exist. It might still well be that many clusters
exist, but the $\Sigma(s)$ curve has no part with zero slope.

We will see that these two cases are actually observed. The energetic method, which can locate the coloring threshold
and from which the survey propagation can be derived, is therefore not a good tool to study the clustering transition.
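The color-symmetric SP update of this subsection is an inclusion-exclusion sum, and it can be cross-checked by brute force. The sketch below assumes the following reading of eq. (35): the numerator is $\sum_{l=0}^{q-1}(-1)^l\binom{q-1}{l}\prod_k[1-(l+1)\eta^{k\to i}]$ (color $s$ and only color $s$ is allowed) and the denominator is $\sum_{l=0}^{q-1}(-1)^l\binom{q}{l+1}\prod_k[1-(l+1)\eta^{k\to i}]$ (at least one color is allowed). The function names and the enumeration scheme are illustrative.

```python
from itertools import product
from math import comb, prod


def w(etas, q):
    """Color-symmetric SP update by inclusion-exclusion (assumed form of eq. (35))."""
    num = sum((-1) ** l * comb(q - 1, l) * prod(1 - (l + 1) * e for e in etas)
              for l in range(q))
    den = sum((-1) ** l * comb(q, l + 1) * prod(1 - (l + 1) * e for e in etas)
              for l in range(q))
    return num / den


def w_brute_force(etas, q):
    """Same quantity by enumerating incoming field types.

    Type 0 is a soft field (probability 1 - q*eta); type t in 1..q is a field
    frozen to color t (probability eta), which forbids color t at the node.
    """
    p_only_color_1 = 0.0
    p_some_color = 0.0
    for types in product(range(q + 1), repeat=len(etas)):
        weight = prod((1 - q * e) if t == 0 else e for t, e in zip(types, etas))
        forbidden = {t for t in types if t != 0}
        if len(forbidden) < q:                  # at least one color survives
            p_some_color += weight
        if forbidden == set(range(2, q + 1)):   # exactly color 1 survives
            p_only_color_1 += weight
    return p_only_color_1 / p_some_color


etas = [0.2, 0.1, 0.3]
assert abs(w(etas, 3) - w_brute_force(etas, 3)) < 1e-12
```

The brute-force version costs $(q+1)^k$ operations for $k$ incoming fields, so it is only useful as a check on small examples.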



2. Generalized survey propagation recursion 

Let us compute how the fraction of hard fields $\eta$ evolves after one iteration of equation (20) at general $m$. There
are two steps in each iteration of (20). In the first step, $\eta$ iterates via eq. (35). In the second step we re-weight the
fields. Writing $P^{\rm hard}(Z)$ for the (unknown) distribution of the reweightings $Z^m$ for the hard fields, one gets

$$\eta^{i\to j} = \frac{1}{\mathcal{N}} \int dZ\, P^{\rm hard}(Z)\, Z^m\, w(\{\eta^{k\to i}\})
  = \frac{w(\{\eta^{k\to i}\})}{\mathcal{N}} \int dZ\, P^{\rm hard}(Z)\, Z^m
  = \frac{w(\{\eta^{k\to i}\})}{\mathcal{N}}\, \overline{Z^m_{\rm hard}}. \qquad (36)$$

A similar equation can formally be written for the soft fields

$$1 - q\,\eta^{i\to j} = \frac{1 - q\,w(\{\eta^{k\to i}\})}{\mathcal{N}}\, \overline{Z^m_{\rm soft}}. \qquad (37)$$

Writing explicitly the normalization $\mathcal{N}$, we finally obtain the generalized survey propagation equations:

$$\eta^{i\to j} = \frac{w(\{\eta^{k\to i}\})}{q\,w(\{\eta^{k\to i}\}) + \left[1 - q\,w(\{\eta^{k\to i}\})\right] r(\{\eta^{k\to i}\})},
  \qquad \text{with} \qquad r(\{\eta^{k\to i}\}) = \frac{\overline{Z^m_{\rm soft}}}{\overline{Z^m_{\rm hard}}}. \qquad (38)$$

In order to do this recursion, the only information needed is the ratio $r$ between soft- and hard-field reweightings,
which is in general difficult to compute since it depends on the full distribution of soft fields.
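The step from the hard- and soft-field equations to the generalized recursion can be spelled out in two lines. The following is a sketch of that step, writing $w$ for $w(\{\eta^{k\to i}\})$ and $\overline{Z^m_{\rm hard}}$, $\overline{Z^m_{\rm soft}}$ for the average reweightings of hard and soft fields; it uses only the requirement that the $q$ hard-field weights and the soft-field weight sum to one.

```latex
\begin{align}
1 &= q\,\eta^{i\to j} + \left(1 - q\,\eta^{i\to j}\right)
   = \frac{q\,w\,\overline{Z^m_{\rm hard}} + (1 - q\,w)\,\overline{Z^m_{\rm soft}}}{\mathcal{N}}
   \quad\Longrightarrow\quad
   \mathcal{N} = q\,w\,\overline{Z^m_{\rm hard}} + (1 - q\,w)\,\overline{Z^m_{\rm soft}}, \\
\eta^{i\to j} &= \frac{w\,\overline{Z^m_{\rm hard}}}{\mathcal{N}}
   = \frac{w}{q\,w + (1 - q\,w)\,r},
   \qquad r = \frac{\overline{Z^m_{\rm soft}}}{\overline{Z^m_{\rm hard}}}.
\end{align}
```

In particular, $r = 1$ gives back $\eta^{i\to j} = w$, and $r \to 0$ gives $\eta^{i\to j} = 1/q$ whenever $w > 0$.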

There are two cases where eq. (38) simplifies so that the hard-field recursion becomes independent of the soft-field
distribution. The first case is, of course, $m = 0$: then $r = 1$ independently of the edge $(ij)$, and the equation reduces
to the original SP. The second case arises for $m = 1$, where one can use the so-called reconstruction formalism and
obtain again a closed set of equations. The computation is done in appendix C and the SP equations at $m = 1$ read

$$\eta^{i\to j} = \frac{1}{q} \sum_{l=0}^{q-1} (-1)^l \binom{q-1}{l} \prod_{k\in i-j} \left[1 - \frac{l\,q}{q-1}\, \eta^{k\to i}\right]. \qquad (39)$$

It would be interesting in the future to use eq. (38) in an algorithm to find proper graph colorings, as it has been
done with the original SP equation [30]. As an approximation one might also use a value $r$ independent of the edge
$(ij)$, but different from one.

For the purpose of the present work, it is important to notice that it is also possible to use eq. (38) in the population
dynamics to simplify the numerical evaluation of the 1RSB solution by separating the hard-field and the soft-field
contributions. Indeed, it gives the exact density of hard fields provided the ratio $r$ is calculated, which is doable
numerically (see appendix D). This allows us to monitor precisely the hard-field density, and only the soft-field part
is given by the population dynamics. This turns out to greatly improve the precision of the numerical solution of the
cavity equations and to considerably speed up the code.



3. The presence of frozen variables 



A natural question is: "When are the hard fields present?" or, more precisely: "When does eq. (38) have a nontrivial
solution $\eta > 0$?" First notice that in order to constrain a node to one color, one needs at least $q - 1$ incoming fields
that forbid all the other colors. It means that the function $w(\{\eta^{k\to i}\})$ defined in eq. (35) is identically zero for
$k < q - 1$ and might be non-zero only for $k \geq q - 1$, where $k$ is the number of incoming fields.

In the limit $r \to 0$ (which corresponds to $m \to -\infty$) eq. (38) gives $\eta = 1/q$ if $w(\{\eta^{k\to i}\})$ is positive, and
$\eta = 0$ if $w(\{\eta^{k\to i}\})$ is zero. Updating eq. (38) on a given graph, from initial conditions $\eta = 1/q$ everywhere, is
equivalent to recursively removing all the nodes of connectivity smaller than $q$. This shows that the first nontrivial
solution with hard fields exists if and only if the $q$-core [68] of the graph is extensive. For regular graphs it is simply at
connectivity $c = q$, while for Erdos-Renyi graphs these critical connectivities can be computed exactly and read, for
small $q$, $c_3 = 3.35$, $c_4 = 5.14$, $c_5 = 6.81$ [68]. Indeed we see that the first nontrivial solution to the 1RSB equation
appears much before those of the original SP equation at $m = 0$.
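The recursive removal of nodes of connectivity smaller than $q$ is the standard peeling procedure for the $q$-core. A minimal sketch is given below, assuming a simple undirected graph stored as a dictionary from node to the set of its neighbors; the function name and the data layout are illustrative.

```python
def q_core(adjacency, q):
    """Return the node set of the q-core of a simple undirected graph.

    adjacency: dict mapping each node to the set of its neighbors.
    Nodes with fewer than q remaining neighbors are removed repeatedly.
    """
    alive = set(adjacency)
    degree = {v: len(adjacency[v]) for v in adjacency}
    stack = [v for v in alive if degree[v] < q]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue                      # already removed earlier
        alive.remove(v)
        for u in adjacency[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < q:
                    stack.append(u)
    return alive


# A complete graph on {0,1,2,3} with a pendant path 0-4-5 attached:
graph = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0, 5}, 5: {4}}
core = q_core(graph, 3)   # the path is peeled away, the complete graph remains
```

Hard fields can exist only on edges inside this core, which is why its extensivity marks the first appearance of a nontrivial hard-field solution.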

On a regular graph, the equations further simplify as $\eta$ factorizes (is edge independent) and follows a simple
self-consistent equation

$$\eta = \frac{w(\eta)}{q\,w(\eta) + [1 - q\,w(\eta)]\, r}. \qquad (40)$$

This equation can be solved for every possible ratio $r$ so that for all $c \geq q$, we can compute and plot the curve $\eta(r)$.
We show the results in fig. 3 for different numbers of colors $q$. On this plot, we observe that $\eta = 1/q$, as predicted,
for $r = 0$. It then gets smaller for larger values of the ratio and, at a critical value $r_{\rm crit}$, the solution disappears
discontinuously and only the (trivial) solution $\eta = 0$ exists. The value $r_{\rm crit}$ corresponds to a critical value $m_r$ of
$m$. For all $m > m_r$ no solution containing frozen variables can exist.
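The self-consistent equation for regular graphs can be explored by direct iteration. The sketch below assumes the inclusion-exclusion form of $w$ with $k = c - 1$ identical incoming fields and iterates $\eta \leftarrow w(\eta)/\{q\,w(\eta) + [1 - q\,w(\eta)]\,r\}$ starting from $\eta = 1/q$; the function names, the fixed iteration count, and the choice of starting point are illustrative assumptions, and no particular value of $r_{\rm crit}$ is claimed.

```python
from math import comb


def w_regular(eta, q, c):
    """Color-symmetric SP function for a c-regular graph (k = c - 1 incoming fields)."""
    k = c - 1
    num = sum((-1) ** l * comb(q - 1, l) * (1 - (l + 1) * eta) ** k for l in range(q))
    den = sum((-1) ** l * comb(q, l + 1) * (1 - (l + 1) * eta) ** k for l in range(q))
    return num / den


def hard_field_fraction(q, c, r, iterations=2000):
    """Iterate the fixed-point equation from eta = 1/q; return the total fraction q*eta."""
    eta = 1.0 / q
    for _ in range(iterations):
        w_val = w_regular(eta, q, c)
        denominator = q * w_val + (1.0 - q * w_val) * r
        if denominator == 0.0:
            return 0.0                    # w = 0 and r = 0: only the trivial solution
        eta = w_val / denominator
    return q * eta


# Scan the ratio r for 3-coloring of 5-regular graphs (values not from the paper):
curve = [(r / 10.0, hard_field_fraction(3, 5, r / 10.0)) for r in range(0, 41)]
```

Plotting `curve` gives a numerical counterpart of the $\eta(r)$ lines: the fraction equals one at $r = 0$ and drops to the trivial solution once $r$ is large enough.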




FIG. 3: The lines are solutions of eq. (40) and give the total fraction $q\eta$ of hard fields for a given value of the ratio
$r = \overline{Z^m_{\rm soft}}/\overline{Z^m_{\rm hard}}$ for $q = 3$ (left) and $q = 4$ (right) in regular random graphs. There is a critical
value of the ratio (full point) beyond which only the trivial solution $\eta = 0$ exists. Note that the solutions at $m = 0$ and
$m = 1$ only exist for a connectivity large enough.



D. Validity conditions of the 1RSB solution 



Now that we have discussed the 1RSB formalism in detail, the next question is: "Is this approach correct?" To
answer this question, one has to test whether the Gibbs measure is extremal within the thermodynamically dominating
pure states. This is equivalent to checking whether the two-step replica symmetry breaking (2RSB) solution is nontrivial.
Computing explicitly the 2RSB solution is however very complicated numerically, especially for Erdos-Renyi graphs.
Instead, the local stability of the 1RSB solution towards 2RSB should be checked, in analogy with the RS stability in
sec. III C 3. It is indeed a usual feature in spin glass physics to observe that the 1RSB glass phase becomes unstable
at low temperatures towards a more complex RSB phase; this phenomenon is called the Gardner transition [69].

To perform the stability analysis [31, 34, 50, 70], one first writes the 2RSB recursion, where the order parameter is
a distribution of distributions of fields on every edge $P_1(P_2(\psi))$, and then two types of 1RSB instabilities have to be
considered depending on the way the 2RSB arises from the 1RSB solution. The first type, called states
aggregation, corresponds to $\delta(P(\psi)) \to P_1(P_2(\psi))$, while the second type, called states splitting, corresponds to
$P(\delta(\psi)) \to P_1(P_2(\psi))$. A complete stability analysis is left for future works, but it is worth discussing the relevance
of the results derived over the last few years [31, 34, 70].

The 1RSB stability was studied for the coloring problem in [31] but only for the energetic zero temperature limit
(29). In this case the parameter $y = \beta m$ is conjugated to the energy. The results derived in [31] (as well as those
previously derived for other problems [34, 70]) thus concern only the clusters of sizes corresponding to $m = 0$, at zero
(for $y = \infty$) or at positive (for finite $y$) energy. The main result of [31, 34] was that the 1RSB approach was stable
in the vicinity of the coloring threshold $c_s$. As we shall see, the clusters corresponding to $m = 0$ are those dominating
the total entropy at the coloring threshold and as a consequence its location is exact within the cavity approach.
The states of the lowest energy (the ground states) in the uncolorable phase also correspond to $m = 0$, and thus



On the other hand, in the colorable phase the stability of the entropically dominating clusters that correspond to
$m > 0$ should be investigated. Some more relevant information can, however, already be drawn from known results. It
was indeed found that the 1RSB approach at $m = 0$ is type I stable for all $y$, and type II stable for all $y > y_I$ in the
vicinity of the coloring threshold. These results concern the states of positive energy, but keeping in mind the
interpretation of $y$ as a slope in the $T$, $m$ diagram, we see that the clusters of zero energy corresponding to small but
nonzero positive $m$ and zero temperature are also stable with respect to both types of instabilities. Near the coloring
threshold, the value of $m^*$ which describes the dominating clusters is close to zero and as a consequence all the
dominating clusters are 1RSB stable in the vicinity of the coloring threshold. Far from the coloring threshold, the
stability analysis of [31, 34] is irrelevant. In particular, the prediction of a full-RSB colorable phase made in
[31, 34, 70] is not correct. Quite the contrary, our preliminary results indicate that all the dominating clusters are
1RSB stable for $q > 3$.

The 3-coloring is however a special case as the clustering transition is continuous. Although the type II instability
seems irrelevant in this case as well, we cannot at the moment dismiss a type I instability close to the clustering
transition. Indeed the entropically relevant clusters correspond to values of $m^*$ close to one in this case, and it is
simple to show that the clusters at $m = 1$ are type I unstable when the transition is continuous: this is
because the type I stability is equivalent to the convergence of the 1RSB update on a single graph. Since for $m = 1$
the averages of the 1RSB fields satisfy the RS belief propagation equations, and since we know from the RS stability
analysis in section III C 3 that those equations do not converge in the RS unstable region (i.e. for $c > c_{RS} = 4$ in
3-coloring of Erdos-Renyi graphs), it follows that the 3-coloring is unstable against state aggregation at $m = 1$ for
all connectivities $c > 4$. Therefore, it is possible that the 1RSB results for 3-coloring are only approximate as far as
the number and the structure of solutions close to the clustering transition are concerned (note that the critical values
for the phase transitions are however correct and do not depend on that). This, and related issues [71], will hopefully be
clarified in future works.

To conclude, we believe that all the transition points we discuss in this paper (and those computed in the K-SAT
problem in [42]), as well as the overall picture, are exact and would not be modified by considering further steps of
replica symmetry breaking.



V. THE COLORING OF RANDOM GRAPHS: CAVITY RESULTS

We now solve the 1RSB equations, and discuss and interpret the results. We solve eq. (28) by the population
dynamics technique; the technical difficulties and the precision of the method are discussed in appendix D. Let
us stress at this point that the correctness of eq. (28) is guaranteed only in the limit of large graphs ($N \to \infty$);
unfortunately the cavity method does not give us any direct hint about the finite graph-size corrections. We start with
the results for the regular random graphs, then consider the ensemble of bi-regular graphs, and after that we turn to
Erdos-Renyi graphs. Finally, we discuss the limit of a large number of colors.

A. Regular random graphs

FIG. 4: Complexity as a function of the internal state entropy $s$ for the $q$-coloring problem on random regular graphs of
connectivity $c$. The full line corresponds to the clusters where a finite fraction of hard fields (frozen variables) is present and
the dotted line to the clusters without hard fields. The circle marks the entropically dominating clusters. Left: ($q = 4$, $c = 9$) is
in the clustered phase; ($q = 5$, $c = 13$) is in a simple replica symmetric phase and ($q = 5$, $c = 14$) is in the condensed clustered
phase. Right: results for 6-coloring for connectivities 17 (RS), 18 (clustered), 19 (condensed) and 20 (uncolorable). For 4-, 5-
and 6-coloring all the smaller connectivities are in the RS phase while all the larger ones are uncolorable.

Let us fix the number of colors $q$, vary the connectivity, and identify successively all the transitions that we shall
encounter. For the sake of the discussion, we choose as a typical example the 6-coloring, and we discuss later in detail
the cases, for different numbers of colors, where some transitions are missing or arrive in a different order. We
solved the 1RSB equation (20) for regular graphs, where the distribution $P^{i\to j}$ is the same for every edge $(ij)$ (see
appendix D), and we plot the resulting curves $\Sigma(s)$ in fig. 4. We now describe the phase space of solutions
when the connectivity is increased:

1) At very low connectivities $c < q$, only the paramagnetic RS solution is found at all $m$, i.e. $P(\psi) = \delta(\psi - 1/q)$.
The phase space is made of a single RS cluster.

2) For larger connectivities $c \geq q$, we saw in section IV C 3 that the 1RSB equations start to have nontrivial solutions
with hard fields in an interval $[-\infty, m_r]$. Interestingly, another nontrivial solution, without hard fields, can be
found numerically in an interval $[m_s, \infty]$, and we shall call this one the soft-field solution. As the connectivity
increases, we find that $m_r$ increases while $m_s$ decreases, so that the gap $[m_r, m_s]$ where no nontrivial solution
exists is getting continuously smaller.

However, there is no nontrivial solution at $m = 1$ for connectivities smaller than $c_d$ (see fig. 4 for the example
of the 6-coloring at $c = 17$). This means that the Gibbs measure (2) is still extremal. In other words the large
RS state still exists and is entropically dominant (its entropy (14) is marked by a circle in fig. 4). Despite the
fact that an exponential number of clusters of solutions exist and that the SP equations converge to a nontrivial
result, a random proper coloring will almost surely belong to the large RS cluster.

3) If the connectivity is increased to and above the clustering threshold $c_d$, a nontrivial solution with positive
complexity $\Sigma$ is found at $m = 1$. In fig. 4 we see that this happens at $c_d = 18$ for the regular 6-coloring.
At this point, the RS Gibbs measure (2) ceases to be extremal and the single large RS cluster splits into
exponentially numerous components. To cover almost all proper colorings we need to consider exponentially
many clusters, $\mathcal{N} \sim e^{N\Sigma(m=1)}$. The probability that two random proper colorings belong to the same cluster
goes exponentially to zero with the system size. The connectivity $c_d$ is thus the true clustering (dynamic)
transition. This is not, however, a thermodynamic phase transition because the 1RSB total entropy reduces to
the RS entropy (14) at $m = 1$, which is analytical in $c$. Thus the RS approach gives the correct number of solutions
and correct marginals as long as the complexity function at $m = 1$ is non-negative.

4) For even larger connectivities $c \geq c_c$, the complexity $\Sigma(m = 1)$ becomes negative, e.g. $c_c = 19$ for 6-coloring.
It means that the clusters corresponding to $m = 1$ are absent with probability one. The total entropy is then
smaller than the RS/annealed one and is dominated by clusters corresponding to $m^* < 1$ such that $\Sigma(m^*) = 0$.
The ordered weights of the entropically dominating clusters follow the Poisson-Dirichlet process (explained in
appendix B). As a consequence, the probability that two random proper colorings belong to the same cluster is
finite in the thermodynamic limit. Another way to describe the situation is that the entropy condenses into a
finite number of clusters. This condensation is a true thermodynamic transition, since the total entropy is
non-analytical at $c_c$ (there is a discontinuity in its second derivative with respect to connectivity). The condensation
is analogous to the static (Kauzmann) glass transition observed in mean field models of glasses [60, 61].

5) For connectivities $c \geq c_s$ ($c_s = 20$ for 6-coloring) even the maximum of the complexity $\Sigma(m = 0)$ becomes
negative. In this case proper colorings are absent with probability going to one exponentially fast with the size
of the graph, and we are in the uncolorable phase.

It is useful to think of the growing connectivity as the addition of constraints to a fixed set of nodes. From this
point of view the set of solutions which exists at connectivity $c$ gets smaller when new edges are introduced and the
connectivity increases. This translates into the cartoon in the introduction (fig. 1) where all the successive transitions
are represented. Finally, another important transition has to be considered:

6) There is a connectivity $c_r$ beyond which the measure is dominated by clusters that contain a finite fraction of
frozen variables. For the regular 6-coloring, $c_r = 19$. We refer to this as the rigidity transition, by analogy
with [72, 73].

The presence or the absence of hard fields inside a given cluster is crucial: if a cluster contains only soft fields, then 
after the addition of a small but finite fraction of new constraints, its size will get smaller (or it will split). If, however, 
a cluster contains a finite fraction of frozen variables, then after adding a small but finite fraction of links the cluster
will almost surely disappear. 

Since the connectivities of regular graphs are integer numbers, we define the dynamical threshold c_d as the smallest
connectivity where a nontrivial 1RSB solution exists at m = 1, the condensation transition c_c as the smallest
connectivity where the complexity at m = 1 is negative, c_r as the smallest connectivity where hard fields are present
at m*, and the coloring threshold c_s as the first uncolorable case. The scenario described here is observed for all cases
of the regular ensemble, although, since connectivities are integer, the transitions are not very well separated at small q.
We summarize the results in table I.

Note that for q > 3, the local RS stability discussed in section III C 3 is irrelevant in the colorable regime. The
only subtle case is the 3-coloring of 5-regular graphs, where the RS solution is only marginally stable, i.e. the
spin glass correlation function goes to zero only algebraically instead of exponentially (from this point of view c = 5
would correspond to the critical point well known in second order phase transitions). More interesting cases will
arise in the other ensembles of random graphs.

TABLE I: Left: The transition thresholds for regular random graphs: c_SP is the smallest connectivity with a nontrivial solution
at m = 0; the clustering threshold c_d is the smallest connectivity with a nontrivial solution at m = 1; the rigidity threshold
c_r is the smallest connectivity at which hard fields are present in the dominant states; the condensation threshold c_c is the
smallest connectivity for which the complexity at m = 1 is negative; and c_s is the smallest UNCOL connectivity. Note that the
3-coloring of 5-regular graphs is exactly critical, so that c_d = 5^+. The rigidity transition may not exist due to the
discreteness of the connectivities. Right: Values of m* (corresponding to the dominating clusters); the hard-field solution
exists in the range [-∞, m_r] and the soft-field solution exists in the range [m_s, ∞].



B. Results for the bi-regular ensemble 



The bi-regular ensemble allows us to fine-tune the connectivity while preserving the factorization of the 1RSB
solution, which is crucial for the numerical precision. It is actually more correct to say that the solution is
"bi-factorized", as all the messages going from the nodes with connectivity c_1 to those with connectivity c_2 are the
same, and the other way around. The bi-regular ensemble allows us to describe with high precision two interesting cases,
which reappear in the Erdos-Renyi ensemble and which are not present in the regular ensemble (again, due to the discrete
nature of the connectivity). Let us recall here that bipartite graphs are always 2-colorable, but we consider only the
color-symmetric cavity solutions, and that is why we get a nontrivial result from this ensemble.

FIG. 5: The complexity as a function of entropy for 4-coloring of bi-regular graphs. Left: 5-21-bi-regular graph, an example
where the entropy is dominated by clusters with soft fields while the gap in the curve Σ(s) still exists. Right: 4-c-bi-regular
graphs for c = 36, 39, 42, 45. In all these cases the replica symmetric solution is locally unstable. In the dependence Σ(s) we
see an unphysical branch of the complexity, which is zoomed in the inset for c = 39.



In fig. 5, the left picture is the result for the complexity as a function of entropy Σ(s) for 4-coloring of 5-21-bi-regular
graphs. The replica symmetric solution in this case is locally stable. We clearly see the gap between the hard-field
and the soft-field solution, and yet we are already beyond the clustering transition c_d; actually the system is in the
condensed phase. This example is similar to what happens for the 4-coloring of Erdos-Renyi graphs.

The second interesting case, the right hand side of fig. 5, is given by the results for Σ(s) for the 4-coloring of
4-c-bi-regular graphs, which are RS unstable for c > 28. Both the clustering and the condensation transitions coincide
with the RS instability, c_d = c_c = 28. The survey propagation equations have a nontrivial solution starting from
c_SP = 37. The rigidity transition is at c_r = 49. Finally, the coloring threshold is c_s = 57. Qualitatively, the results for
this 4-c-bi-regular ensemble are the same as those for the 3-coloring of Erdos-Renyi random graphs.

We see that for c < 42 the gap between the hard-field (full line) and soft-field (dotted line) solutions exists. For
m > m_s there is a non-physical nontrivial soft-field solution, the convex part of the line in the figure, zoomed in the
inset. It means that for m < m_r we can actually find two solutions, depending on whether or not we start with a population
containing enough hard fields. The unphysical branch survives even when the gap [m_r, m_s] closes; see the example of
c = 45 in the figure.

We would like to stress at this point the strong similarity of the soft-field part of the curve Σ(s) to the one in
fig. 4 in ref. (3f|. Actually the variational results of (3f| should be very precise and relevant near the continuous
clustering transition (this is also the case for the 3-coloring of Erdos-Renyi graphs or for 3-SAT).



C. Results for Erdos-Renyi random graphs 



For Erdos-Renyi random graphs, obtaining the solution of eq. (25) is computationally more involved, as the solution is
no longer factorized. In the population dynamics a population of populations has to be updated, which is numerically
possible only for small populations, and so one has to be careful that the finite population-size corrections are small
enough; see details in appendix D. However, all the computations can be done with the same computational complexity
as for the regular graphs for m = 0, the energetic zero temperature limit (section IV C 1), and for m = 1 (appendix C).
That is enough to obtain the SP, clustering, condensation and COL/UNCOL transitions (of which the first and the
last were computed in [30]). We can also compute exactly when hard fields appear for m = 1, eq. (C9); this
transition is further studied in [64]. Finally, using the generalized survey propagation equation introduced in section
IV C 2, the rigidity transition can be computed quite precisely.

1. The general case for q > 3, discontinuous clustering transition

The phase transitions in q-coloring of random Erdos-Renyi graphs for q > 3 are qualitatively identical to those
discussed in the case of random regular graphs. We plot the results for the total entropy (number of solutions) and
complexity (number of clusters which dominate the entropy) in the 4- and 5-coloring in fig. 6.

At the clustering transition c_d the complexity becomes discontinuously positive: the large RS cluster suddenly
splits into an exponential number of smaller ones. The total entropy Σ* + s* is given by the RS formula (14) up to the
condensation transition c_c. At the condensation transition the complexity of the dominating clusters becomes zero, and
the total entropy s_tot = s* < s_RS is given by the point where Σ(s*) = 0. The function s_tot(c) is non-analytical at
the point c_c; it has a discontinuity in the second derivative. At the coloring threshold c_s all the clusters of solutions
disappear. Note, however, that the total entropy of the last existing clusters is strictly positive (about a half of the
total entropy at the condensation transition). That means that the COL/UNCOL transition is not only sharp but
also discontinuous in terms of the entropy of solutions. Note that the positive entropy has two contributions: the trivial
and smaller one coming from the presence of leaves and other small subgraphs, and the nontrivial and more important
one connected with the fact that the ground state entropy is positive, even in the uncolorable phase or for the random
regular graphs.

Finally, we located the rigidity transition, where frozen variables appear in the dominating clusters. For 3 < q < 8
this transition appears in the condensed phase. As the number of colors grows it approaches the clustering transition.
All the four critical values c_d, c_r, c_c and c_s are summarized in table II; values of c_SP and c_r(m = 1) are given for
comparison.




FIG. 6: The 1RSB total entropy and complexity of the dominating clusters for 4- and 5-coloring of Erdos-Renyi random
graphs. The complexity jumps discontinuously at the clustering transition c_d while the total entropy stays analytical. The
complexity disappears at the condensation transition c_c, causing a non-analyticity in the total entropy. Finally, the total
entropy discontinuously disappears at the coloring threshold. The dashed line is the RS entropy, left for comparison.



2. The special case of 3-coloring, continuous clustering transition

The only case which is left to be discussed is the 3-coloring of Erdos-Renyi graphs. It is different from q > 3
because the replica symmetric solution is locally unstable in the colorable phase (see section III C 3). The extremality
condition underlying the RS assumption ceases to be valid because of the mechanism discussed in section III C 3, with
a divergence of the spin glass correlation length: the main difference from the previous cases is therefore that the
clustering transition is continuous and coincides with the condensation transition.

However, the phenomenology does not differ too much from the other cases: c_RS = c_d = c_c = 4; the phase where
the entropy is dominated by an exponential number of states is thus missing and the complexity corresponding to m = 1
is always negative (see fig. 7 left, together with the dependence of the total entropy on the connectivity). Note that
the curves Σ(s) for the 3-coloring have already been studied in [H, [39[, where the authors considered however only
the range of connectivities c = [4.42, 4.69] = [c_SP, c_s].

  q     c_d        c_r        c_c        c_s         c_SP       c_r(m=1)
  3     4          4.66(1)    4          4.687(2)    4.42(1)    ...
  4     8.353(3)   8.83(2)    8.46(1)    ...         ...        ...
  ...   ...        29.93(3)   ...        ...         ...        ...
  ...   33.45(5)   ...        36.08(5)   36.490(5)   30.62(2)   ...
  10    39.0(1)    41.508     42.50(5)   42.93(1)    35.69(3)   41.508

TABLE II: Critical connectivities c_d (dynamical, clustering), c_r (rigidity, rearrangements), c_c (condensation, Kauzmann) and
c_s (COL/UNCOL) for the phase transitions in the coloring problem on Erdos-Renyi graphs. The connectivities c_SP (where the
first nontrivial solution of SP appears) and c_r(m=1) (where hard fields appear at m = 1) are also given. The error bars reflect
the numerical precision in the evaluation of the critical connectivities by the population dynamics technique; details are given
in appendix D.



All the results derived for the 4-coloring of 4-c-bi-regular bipartite graphs are qualitatively valid also here. We are
thus not surprised by the fact that in the interval c = [4, 4.42] the survey propagation algorithm gives a trivial result:
simply, the maximum of the curve Σ(s) does not exist yet, i.e. there is no nontrivial solution at m = 0. Yet, the entropy
is dominated by a finite number of the largest clusters, which do not contain hard fields. The two solutions (hard-field and
soft-field) join at a connectivity around 4.55. Finally, at c_r = 4.66 the hard fields arrive in the dominating states (and
in consequence in all the others).

FIG. 7: Left: The total entropy for 3-coloring of Erdos-Renyi random graphs. The dashed line is the replica symmetric (also
the annealed) entropy, left for comparison. The complexity at m = 1 is also shown; it is negative for c > 4, although for
connectivities close to four it is very close to zero. Right: The values of the parameter m* (Σ(m*) = 0) as a function of
connectivity for q = 3, 4, 5 and in the large q limit. The connectivity c is rescaled as (c - c_c)/(c_s - c_c). It is striking
that for q > 3 the curves are so well fitted by the large q limit one. We are not even able to see the difference, given the
error bars, which are roughly of the point size.



3. The overlap structure 



We now give some results about the overlap structure in the random coloring problem to develop the intuition about
clusters. First, consider the marginal probabilities ψ^{i,α}_{s_i} within a cluster α. Note that due to the color symmetry
there exist q! - 1 other clusters that differ only by a permutation of colors. We define the intra-cluster overlap of two
solutions (averaged over states) as



(41)



In the paramagnetic phase δ = 1/q; otherwise we have to compute it from the fixed point of equation (20). The
overlap between two solutions which lie in two clusters that differ just by a permutation π of colors is

\delta_j = \delta \, \frac{j-1}{q-1} + \frac{q-j}{q(q-1)} , \qquad (42)

where j is the number of fixed positions in the permutation π (in particular δ_q = δ and δ_1 = 1/q). In fig. 8 we show
the overlap structure for 3- and 4-coloring. The probabilities that two random solutions have one of the overlaps can
be computed from the Poisson-Dirichlet process described in appendix B; in fact this is not a self-averaging quantity.

FIG. 8: Left: Overlap structure in 3-coloring of random graphs as a function of connectivity. The intra-cluster overlap (upper
curve) grows continuously from 1/3 at the clustering transition c = 4. From top to bottom, the curves are δ = δ_3, δ_1 and δ_0.
Right: Overlap structure in 4-coloring of random graphs as a function of connectivity. The intra-cluster overlap (upper curve)
jumps discontinuously from 1/4 at the clustering transition c = 8.35. The probability that two random solutions belong to the
same cluster, however, is zero between the clustering and condensation transitions [8.35, 8.46]. From top to bottom, the curves
are δ = δ_4, δ_2, δ_1 and δ_0.
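
As an illustration of the overlap structure, the following minimal sketch evaluates the inter-cluster overlaps for a given intra-cluster overlap. It assumes that eq. (42) reads δ_j = δ (j-1)/(q-1) + (q-j)/(q(q-1)), which is the form consistent with the limits δ_q = δ and δ_1 = 1/q quoted in the text; the function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def inter_cluster_overlap(delta, q, j):
    """Overlap between two clusters related by a color permutation with j fixed points.

    Assumed form of eq. (42): delta_j = delta*(j-1)/(q-1) + (q-j)/(q*(q-1)).
    """
    return delta * Fraction(j - 1, q - 1) + Fraction(q - j, q * (q - 1))


def allowed_fixed_points(q):
    """A permutation of q colors cannot have exactly q-1 fixed points."""
    return [j for j in range(q + 1) if j != q - 1]


if __name__ == "__main__":
    q = 4
    delta = Fraction(1, 2)  # illustrative value of the intra-cluster overlap
    for j in reversed(allowed_fixed_points(q)):
        print(j, inter_cluster_overlap(delta, q, j))
```

For q = 3 the allowed values are j = 3, 1, 0 and for q = 4 they are j = 4, 2, 1, 0, which matches the curves listed in the caption of fig. 8.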



D. Large q Asymptotics 

We give here the exact analytical large q expansion of the previous results. In the asymptotic computations the
regular and Erdos-Renyi ensembles are equivalent (the corrections are of smaller order in q than the orders we give).
We refer to appendix E for the explicit derivation of the formulae.

At large q a first set of transitions arises for connectivities scaling as q log q:

c_{SP} = c_r(m=0) = q \, [\log q + \log\log q + 1 - \log 2 + o(1)] , \qquad (43)
c_r \underset{q \to \infty}{=} c_r(m=1) = q \, [\log q + \log\log q + 1 + o(1)] . \qquad (44)

c_SP was already computed in [31] and c_r is the rigidity transition. The clustering transition has to appear before the
rigidity one, c_d < c_r. For all the finite q cases we looked at, c_d was between c_SP and c_r.

A second set of transitions arises for connectivities scaling as 2q log q:

c_c = 2q \log q - \log q - 2\log 2 + o(1) , \qquad (45)
c_s = 2q \log q - \log q - 1 + o(1) . \qquad (46)

The condensation thus appears very close to the COL/UNCOL transition, and both are very far from the clustering and
rigidity transitions (those are about half way in the phase diagram).
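
To get a feeling for these asymptotic expressions, the following minimal sketch evaluates them at finite q with the o(1) terms dropped. It assumes that eqs. (43)-(46) read c_SP = q[log q + log log q + 1 - log 2], c_r = q[log q + log log q + 1], c_c = 2q log q - log q - 2 log 2 and c_s = 2q log q - log q - 1; the function name is hypothetical.

```python
import math


def large_q_thresholds(q):
    """Leading-order large-q thresholds (o(1) corrections neglected)."""
    log_q = math.log(q)
    return {
        "c_SP": q * (log_q + math.log(log_q) + 1.0 - math.log(2.0)),
        "c_r": q * (log_q + math.log(log_q) + 1.0),
        "c_c": 2.0 * q * log_q - log_q - 2.0 * math.log(2.0),
        "c_s": 2.0 * q * log_q - log_q - 1.0,
    }


if __name__ == "__main__":
    for name, value in large_q_thresholds(10).items():
        print(f"{name} = {value:.3f}")
```

Evaluated at q = 10, these leading-order values can be compared directly with the q = 10 row of table II. Within this approximation the distance between condensation and the coloring threshold is the constant 2 log 2 - 1, independent of q.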

We also show in appendix E that for connectivity c = 2q log q - log q + α, one has

2q \, s(m) \simeq 2^m \log 2 , \qquad (47)
2q \, \Sigma(m) \simeq 2^m - 2 - m \, 2^m \log 2 - \alpha . \qquad (48)

Since the RS free energy is correct until c_c, which differs just by a constant from c_s, this means that for all
connectivities below c_c the number of solutions is correctly given by the replica symmetric entropy (14). Indeed, the value
s(m = 1) can be obtained by a large q expansion of eq. (14).

In fig. 9 we plot the complexity of dominating clusters Σ* = Σ(m*), the total entropy s_tot = Σ* + s*, and the
physical value of m* as a function of connectivity c = 2q log q - log q + α. Note that the properly scaled values of the
total number of solutions at c_c and c_s, and the values c_c, c_s themselves, are already very close to those at q = 3, 4, 5
(see fig. 6 and fig. 7 left). The closeness is particularly striking for the values of m* for q = 4 and q = 5 (see fig. 7
right).

These formulae show that in the large q limit, near the coloring threshold, it is the number of clusters which
changes with connectivity (i.e. with α), and not their internal entropy (size). To leading order, adding constraints near
the COL/UNCOL transition thus destroys clusters of solutions, but does not make them smaller: this is due to the
fact that these clusters are dominated by frozen variables, so that adding a link kills them most of the time. We also
computed the entropy value at the condensation transition, and found s(m = 1) = log 2/q. The entropy of the last
cluster (exactly at the COL/UNCOL transition) is s = log 2/(2q).
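
The large q picture can be explored numerically with the minimal sketch below. It assumes that eqs. (47) and (48) read 2q s(m) = 2^m log 2 and 2q Σ(m) = 2^m - 2 - m 2^m log 2 - α, works with the rescaled quantities 2q s and 2q Σ, and finds m* by bisection in the condensed phase; all function names are hypothetical.

```python
import math

LOG2 = math.log(2.0)


def rescaled_entropy(m):
    """2q * s(m): internal entropy of clusters selected by m (assumed eq. (47))."""
    return 2.0 ** m * LOG2


def rescaled_complexity(m, alpha):
    """2q * Sigma(m) at connectivity c = 2q log q - log q + alpha (assumed eq. (48))."""
    return 2.0 ** m - 2.0 - m * 2.0 ** m * LOG2 - alpha


def m_star(alpha, tol=1e-12):
    """Physical value of m: 1 below condensation, root of Sigma(m) = 0 above it."""
    if alpha <= -2.0 * LOG2:          # complexity at m = 1 is non-negative
        return 1.0
    if alpha > -1.0:                  # even the maximum Sigma(m = 0) is negative
        raise ValueError("uncolorable phase: alpha > -1")
    lo, hi = 0.0, 1.0                 # Sigma decreases with m on [0, 1]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rescaled_complexity(mid, alpha) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rescaled_total_entropy(alpha):
    """2q * s_tot = 2q * (Sigma* + s*), with Sigma* = 0 in the condensed phase."""
    m = m_star(alpha)
    return max(rescaled_complexity(m, alpha), 0.0) + rescaled_entropy(m)


if __name__ == "__main__":
    for alpha in (-2.0, -2.0 * LOG2, -1.2, -1.0):
        print(alpha, m_star(alpha), rescaled_total_entropy(alpha))
```

In this sketch the condensation point α = -2 log 2 gives m* = 1 and the coloring threshold α = -1 gives m* = 0 with 2q s_tot = log 2, in agreement with the values quoted in the text.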




FIG. 9: Analytical result for the large q asymptotics close to the COL/UNCOL transition for c = 2q log q - log q + α. Left:
(rescaled) complexity versus (rescaled) internal entropy for different connectivities. The condensation transition appears for
α = -2 log 2. The maximum of the complexity becomes zero at c_s for α = -1. Right: Total entropy (s_tot), complexity (Σ*)
and the parameter m* versus α. Notice how the values for the total number of solutions are already very close to those for
finite low q in figs. 6 and 7.



VI. ALGORITHMIC CONSEQUENCES 

In this section we give some algorithmic consequences of our findings. First, we discuss the whitening procedure. We 
then introduce a random walk algorithm adapted from the Walk-SAT strategy and study its performance. We show 
in particular that the clustering/dynamical transition does not correspond to the onset of hardness in the problem and 
argue that it is instead the rigidity transition. Finally, we discuss the performance of the belief propagation algorithm 
in counting and finding solutions, and show that it works much better than previously anticipated.






A. The whitening procedure 

The whitening procedure, as introduced in [65], can distinguish between solutions which belong to a cluster containing
hard fields and those which do not. Generally, whitening is equivalent to warning propagation (a version of belief
propagation which distinguishes only whether a field is hard or not). Warning propagation for coloring was derived in [30].
Let us call u^{i→j} = (1, 0, 0, ..., 0) the hard field in the direction of the first color, i.e. in the absence of node j the
node i takes only the first color in all the colorings belonging to the cluster under consideration, and similarly for the
other colors. We set u^{i→j} = (0, 0, 0, ..., 0) if i is not frozen in the cluster; we then say that the oriented edge i → j
is "white". The update for the u's follows from (T5))

u_s^{i \to j} = \min_r \Big( \sum_{k \in i - j} u_r^{k \to i} + \delta_{r,s} \Big) - \min_r \Big( \sum_{k \in i - j} u_r^{k \to i} \Big) . \qquad (49)

To see whether a solution {s_i} belongs to a cluster with frozen variables or not, we initialize warning propagation with
u_s^{i→j} = δ_{s,s_i} and update iteratively according to (49) until a fixed point is reached (the update always converges,
because starting from a solution we are only adding white edges). At the fixed point, either all edges are white, and then the
solution {s_i} does not belong to a frozen cluster, or some of the edges stay colored (non-white), and then the solution {s_i}
belongs to a frozen cluster. Note that in the K-SAT problem (but not in general), whitening is equivalent to a more
intuitive procedure, where the directed edges are not considered [66].
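
The whitening of a given proper coloring can be sketched in a few lines. The code below is only an illustration under stated assumptions: the graph is given as a symmetric adjacency dictionary, the directed edge i → j stays colored only if every color other than s_i is forced on i by at least one still-colored incoming edge from a neighbor other than j, and the function names are hypothetical.

```python
def whitening(adj, coloring, q):
    """Whiten a proper q-coloring; return {(i, j): True if edge i->j stays colored}."""
    # Start from the solution: every directed edge carries the hard field of its source.
    colored = {(i, j): True for i in adj for j in adj[i]}
    changed = True
    while changed:                      # edges only turn white, so this terminates
        changed = False
        for (i, j), is_colored in list(colored.items()):
            if not is_colored:
                continue
            # Colors forbidden to i by neighbors k != j whose edge k->i is still colored.
            forbidden = {coloring[k] for k in adj[i] if k != j and colored[(k, i)]}
            forbidden.discard(coloring[i])
            if len(forbidden) < q - 1:  # some other color is still allowed: i is not frozen
                colored[(i, j)] = False
                changed = True
    return colored


def in_frozen_cluster(adj, coloring, q):
    """True if some directed edge stays colored at the whitening fixed point."""
    return any(whitening(adj, coloring, q).values())


if __name__ == "__main__":
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(in_frozen_cluster(cycle, [0, 1, 0, 1], q=2))   # 2-coloring of an even cycle
    path = {0: [1], 1: [0, 2], 2: [1]}
    print(in_frozen_cluster(path, [0, 1, 0], q=2))       # leaves whiten the whole path
```

Because the update is monotone (colored edges can only become white), sequential and parallel update orders reach the same fixed point, which is why a simple sweep until nothing changes is sufficient here.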

We wish to offer here an explanation of a paradox observed in [66, 67]. The SP algorithm gives information
on the frozen variables in the most numerous clusters (m = 0). Yet, the solutions which are found by the standard
implementation (decimation and SP plus Walk-SAT) do not belong to clusters with frozen variables, since they always
give a trivial whitening result (all directed edges are white) [66, 67]. We suggest that the decimation strategy drives
the system towards a solution belonging to a large cluster, which does not contain frozen variables. In this case, it
is logical that the result of the whitening is trivial, as is observed. We believe this is the reason why no nontrivial
whitenings have been observed so far in the study of the K-SAT problem on large graphs.

Note that beyond the rigidity transition this argument does not work anymore, since there all the clusters (for all
m such that Σ(m) > 0) contain frozen variables. More precisely, for q > 9 we could in principle end up in soft clusters
even beyond the rigidity transition (since that one concerns only the dominant states); whether this is possible is left for
further investigation. Interestingly, in the coloring problem we have not been able to find solutions beyond the rigidity
transition even with the survey propagation algorithm (compare c_r with the performance of SP in [30]). Further, more
systematic investigations of these issues are needed, employing other strategies for the use of the survey
propagation equations (for example the reinforcement [z3|).



B. A Walk-COL algorithm to color random graphs



In this paper, we have computed the correct clustering transition c_d for the random coloring problem. Beyond this
transition, Monte Carlo algorithms are proven not to reach equilibrium, as their equilibration time diverges [HI, |62| .
It was often claimed, or assumed, that this point corresponds to the onset of hardness of the problem. However,
the fact that the physical dynamics does not equilibrate just means that the complete set of solutions will not be
correctly sampled (indeed, Monte Carlo experiments clearly display slow relaxation [75]), but not that no solutions
can eventually be found. This simple fact explains the results of [32], where a simple annealing procedure was shown
to 3-color an ER graph beyond c_d = 4.

In this section, we use a local search strategy which does not satisfy the detailed balance condition. Therefore, we
do not expect to be able to find typical solutions; however, it might be possible to find some solutions to the problem.
The Walk-COL algorithm (8?| is a simple adaptation of the celebrated Walk-SAT [76]. More precisely, we adapted
the method designed for satisfiability in [77]. Given a graph, and starting from an initial random configuration, we
repeatedly apply the following procedure:

1) Choose at random a spin which is not satisfied (i.e. at least one of its neighbors has the same color).

2) Change its color at random. Accept this change with probability one if the number of unsatisfied spins has been
lowered; otherwise accept it with probability p.

3) If there are still unsatisfied nodes, go to step 1), unless the maximum running time is reached.
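
The three steps above can be sketched as follows. This is a minimal illustration and not the implementation used for the results reported here: it assumes that the new color is drawn uniformly among the q - 1 other colors, that the running time is limited by a number of attempted flips, and that the graph is given as adjacency lists indexed by 0, ..., N-1; the names walk_col, max_flips and seed are hypothetical.

```python
import random


def walk_col(adj, q, p=0.02, max_flips=1_000_000, seed=0):
    """Return (coloring, flips) on success or (None, max_flips) on failure."""
    rng = random.Random(seed)
    n = len(adj)
    color = [rng.randrange(q) for _ in range(n)]        # random initial configuration

    def unsat(i):
        return any(color[k] == color[i] for k in adj[i])

    bad = [i for i in range(n) if unsat(i)]              # list of unsatisfied nodes
    pos = {i: idx for idx, i in enumerate(bad)}          # position of each node in `bad`

    def refresh(i):
        """Insert or remove i from `bad` in O(1) after a color change nearby."""
        now_bad = unsat(i)
        if now_bad and i not in pos:
            pos[i] = len(bad)
            bad.append(i)
        elif not now_bad and i in pos:
            idx = pos.pop(i)
            last = bad.pop()
            if last != i:
                bad[idx] = last
                pos[last] = idx

    for flips in range(max_flips):
        if not bad:
            return color, flips
        i = bad[rng.randrange(len(bad))]                 # step 1: random unsatisfied node
        old = color[i]
        touched = [i] + list(adj[i])                     # only these can change status
        before = sum(unsat(k) for k in touched)
        color[i] = rng.choice([c for c in range(q) if c != old])   # step 2: new color
        after = sum(unsat(k) for k in touched)
        if after < before or rng.random() < p:           # accept
            for k in touched:
                refresh(k)
        else:                                            # reject and restore
            color[i] = old
    return (color if not bad else None), max_flips


if __name__ == "__main__":
    ring = [[(i - 1) % 10, (i + 1) % 10] for i in range(10)]
    print(walk_col(ring, q=3))
```

Only the flipped node and its neighbors can change their status, so the change in the number of unsatisfied nodes is computed locally; this keeps the cost of one attempted flip independent of N for bounded degree.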

The probability p has to be tuned in each different case for a better efficiency of the algorithm. Typically, values
between 0.01 and 0.05 give good results. We shall now briefly discuss the performance of the algorithm, to illustrate the
two following points: (a) When the phase space is RS, we observe that Walk-COL finds a solution in linear time. (b)
Even in the "complex" phase for c > c_d, the algorithm can in some cases find solutions in linear time.

Concerning the first point, we tested the algorithm in the RS phase of regular random graphs for q = 3, 4, 5, 6, 7.
In all these cases, we were able to color in linear time all the graphs of connectivities that correspond to a replica
symmetric solution. In particular, the cases (q = 3, c = 5), (q = 5, c = 13), (q = 6, c = 17), (q = 7, c = 21),
(q = 7, c = 22) are found to be colorable with the Walk-COL algorithm even if a nontrivial solution to the SP
equations exists.

Concerning the second point, we considered the 3- and 4-coloring of Erdos-Renyi random graphs. The results
are shown in fig. 10, where the percentage of unsatisfied spins versus the number of attempted flips (averaged over
5 different realizations) divided by N is plotted. We observe that the curves corresponding to different values of N
superpose quite well (and that actually the results for N = 2·10^5 are systematically lower than those for N = 5·10^4),
so that an estimate of the time needed to color a graph can be obtained. The connectivities of these graphs are
beyond the dynamical transition (c_d = 4 for 3-coloring and c_d = 8.35 for 4-coloring). It would be interesting to
systematically test Walk-COL, as has been done for Walk-SAT, to derive the precise connectivity at which
it ceases to be linear.

FIG. 10: Performance of the Walk-COL algorithm in coloring random graphs for 3-coloring (left) and 4-coloring (right). We
plot the rescaled time (averaged over 5 instances) needed to color a graph of connectivity c. The strategy allows one to go
beyond the clustering transition (c_d = 4 for 3-coloring and c_d = 8.35 for 4-coloring) in linear time with respect to the size
of the graph.



Already these results show that the dynamical transition is not a problem for the algorithms. This can also be
observed in a number of numerical experiments for the satisfiability [73, 78] and the coloring [jj [Z9[ problems.

We believe, however, that the rigidity transition plays a fundamental role for the average computational complexity. 
A first argument for this is that, for large graphs, it seems that all the known algorithms are only able to find solutions 
with a trivial whitening, i.e. solutions that belong to clusters without hard fields. Beyond the rigidity transition 
however, the clusters without hard fields become very rare (in the sense that the dominating clusters and all the 
smaller, more numerous ones, contain hard fields). For q > 9, perhaps the connectivity where hard fields appear in
clusters corresponding to Σ(m) = 0, m > 1, should be considered instead. This suggests that the known algorithms will not
be able to find a solution beyond this point.

A second argument is the following: local search algorithms are either attracted into a solution or get stuck in a
metastable state. These metastable states, in order to be able to trap the dynamics, have to contain a finite fraction of
hard fields. Given an algorithm, determining which of these two situations happens is not only a question of the existence
of states, but also a question of basins of attraction. A theoretical analysis of such basins is a very difficult task,
so that the precise analysis of the behavior of local algorithms remains a hard problem. However, the metastable
states are known, from the cavity formalism, to be much more numerous than the zero-energy states. Moreover, the
basin of attraction of a zero-energy state that contains hard fields probably does not differ much from those of the
metastable states (while, on the other hand, the basin of attraction of a zero-energy state which does not contain hard
fields might be slightly different and arguably relatively larger). It thus seems reasonable to us that local algorithms
will get trapped by the metastable states beyond the rigidity transition.

A similar conclusion was reached recently in [73], where the recursive implementation of the Walk-COL algorithm
was studied and found to be somewhat simpler to analyze. Again the strategy was found to be efficient (with linear
time with respect to the size of the graph) beyond c_d but below c_r. The precise algorithmic implications of the rigidity
transition thus require further investigation, maybe along the lines of ff^L |78|.

FIG. 11: Performance of the BP algorithm plus decimation in coloring random graphs for 3-coloring (left) and 4-coloring
(right). The strategy described in the text allows one to color random graphs beyond the clustering and even the condensation
transitions.

C. A belief propagation algorithm to color random graphs

Another consequence of our results, which we already discussed briefly in [42], is that the standard belief propagation
(BP) algorithm gives correct marginals until the condensation transition. It is actually a simple algebraic fact that the
1RSB approach at m = 1 gives the same results for the marginals (average probabilities over all clusters) as the simple
RS approach (see appendix C). Moreover, the log-number of solutions in clusters corresponding to m = 1 is also equal
to the RS one. This suggests using the BP marginals (as was already suggested in [4Cj) and a decimation procedure
to find proper colorings. Compared with the SP algorithm, which has a computational complexity proportional to q!
(factorial), that of BP is only proportional to q. We have seen moreover that for large numbers of colors the condensation
point is very close to the COL/UNCOL transition, so that BP could be used in a large interval of connectivities.

As a simple application, we tested how the straightforward implementation of the BP algorithm plus decimation
allows one to find solutions of 3- and 4-coloring of random Erdos-Renyi graphs. Note that in the 3-coloring the
clustered but not condensed phase is missing, so the argument above does not concern this case. The algorithm
works by iterating the following procedure:

(i) Run BP on the graph for a given number I of iterations.

(ii) Consider the most biased variable, and color it with its most probable color.

Two problems have to be mentioned. The first one is rather trivial: since at the beginning all colors are symmetric,
the first color has to be chosen at random. The second one is more serious and concerns the convergence of BP. Indeed,
we saw that there is a local instability in the BP (replica symmetric) equations at connectivity c = 4 for the 3-coloring
of random graphs, so that the BP equations do not converge. This seems to be a problem restricted to the 3-coloring,
but even in the case of 4- or more coloring, the BP equations do not converge on the decimated graph when a finite
fraction (typically a few percent) of variables is fixed. The reason for that remains to be understood.

Nevertheless, since we merely want to design an effective tool to solve the coloring problem, we choose to avoid this
problem by fixing the number of iterations I at each step and thus ignoring the non-convergence. We tried the method
on both the 3- and 4-coloring and obtained unexpectedly good results. We used the following protocol in the code:
we first try to find a solution with I = 10. If we do not succeed, we restart with I = 20 and once more with I = 40.
We tried that on 10 different samples for different connectivities. The probability of finding a proper coloring under these
conditions is shown in fig. 11.
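
The BP-plus-decimation protocol described above can be sketched as follows. This is a minimal illustration, not the code used to produce fig. 11. It assumes the standard zero-temperature BP update for coloring, ψ_s^{i→j} ∝ Π_{k∈i-j} (1 - ψ_s^{k→i}); it breaks the initial color symmetry with weakly random initial messages rather than by choosing the first color at random; and it costs one full BP run per decimated variable, which is fine for small graphs only. All function names are hypothetical.

```python
import random


def bp_decimation(adj, q, n_iter, seed=0):
    """Return a proper coloring (list of colors) or None if a contradiction is met."""
    rng = random.Random(seed)
    n = len(adj)
    fixed = {}                                   # node -> color chosen by decimation
    psi = {}                                     # psi[(i, j)][s]: message from i to j
    for i in range(n):
        for j in adj[i]:
            w = [1.0 + 0.1 * rng.random() for _ in range(q)]   # weak random bias
            z = sum(w)
            psi[(i, j)] = [x / z for x in w]

    def cavity(i, skip):
        """Color distribution of i when neighbor `skip` is ignored (None: full marginal)."""
        if i in fixed:
            return [1.0 if s == fixed[i] else 0.0 for s in range(q)]
        w = [1.0] * q
        for k in adj[i]:
            if k != skip:
                for s in range(q):
                    w[s] *= 1.0 - psi[(k, i)][s]
        z = sum(w)
        return [x / z for x in w] if z > 0.0 else None   # None: no color left for i

    while len(fixed) < n:
        for _ in range(n_iter):                  # step (i): n_iter sweeps, no convergence test
            for (i, j) in psi:
                new = cavity(i, j)
                if new is None:
                    return None
                psi[(i, j)] = new
        best = None                              # step (ii): most biased free variable
        for i in range(n):
            if i not in fixed:
                marg = cavity(i, None)
                if marg is None:
                    return None
                if best is None or max(marg) > best[0]:
                    best = (max(marg), i, marg.index(max(marg)))
        _, i, s = best
        fixed[i] = s
    return [fixed[i] for i in range(n)]


def color_graph(adj, q, schedule=(10, 20, 40)):
    """Restart protocol: try I = 10, then 20, then 40 BP iterations per decimation step."""
    for n_iter in schedule:
        coloring = bp_decimation(adj, q, n_iter)
        if coloring is not None:
            return coloring
    return None


if __name__ == "__main__":
    cycle = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
    print(color_graph(cycle, q=3))
```

A fixed variable sends a delta message, so the marginal of any neighbor vanishes on the fixed color; the sketch therefore either returns a proper coloring or stops at the first variable for which no color is left.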

We thus observe that the BP strategy is able to find solutions, even beyond the condensation transition. This shows 
clearly that the decimation procedure is a nontrivial one, and that the problem is not really hard in that region of 
connectivities. Note that the SP algorithm plus decimation has been shown to work very well in the 3-coloring until 
about c = 4.60 [30]: our results are thus very close to those obtained using SP. This raises again the question of the rigidity 
transition $c_r = 4.66$, which might also be problematic for the decimated survey propagation solver. 
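
The BP-plus-decimation protocol described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used for the reported results: the function names `bp_decimation` and `color_graph`, the small random perturbation used to break the initial color symmetry, the sequential update order, and the rule "give up as soon as a normalization vanishes" are all choices made here for the example. The zero-temperature update assumed is $\psi^{i\to j}_s \propto \prod_{k \in \partial i \setminus j} (1 - \psi^{k\to i}_s)$.

```python
import random


def _normalize(weights):
    total = sum(weights)
    if total <= 0.0:
        return None  # contradiction: every color is forbidden
    return [w / total for w in weights]


def bp_decimation(n, edges, q, n_iter, rng):
    """One BP-decimation attempt on a simple graph; returns colors or None."""
    nbrs = [[] for _ in range(n)]
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    # msg[(i, j)][s]: probability that i takes color s when edge (i, j) is cut.
    # A small random perturbation breaks the initial color symmetry (assumption).
    msg = {}
    for u, v in edges:
        for i, j in ((u, v), (v, u)):
            msg[(i, j)] = _normalize([1.0 + 0.01 * rng.random() for _ in range(q)])
    color = [None] * n

    def product(i, skip):
        out = [1.0] * q
        for k in nbrs[i]:
            if k != skip:
                for s in range(q):
                    out[s] *= 1.0 - msg[(k, i)][s]
        return out

    for _ in range(n):
        # (i) run a fixed number of BP sweeps, ignoring non-convergence
        for _ in range(n_iter):
            for (i, j) in msg:
                if color[i] is not None:
                    continue  # a fixed node keeps sending a delta message
                new = _normalize(product(i, j))
                if new is None:
                    return None
                msg[(i, j)] = new
        # (ii) fix the most biased free variable to its most probable color
        best = None
        for i in range(n):
            if color[i] is None:
                marg = _normalize(product(i, None))
                if marg is None:
                    return None
                s = marg.index(max(marg))
                if best is None or marg[s] > best[0]:
                    best = (marg[s], i, s)
        _, i, s = best
        color[i] = s
        for j in nbrs[i]:
            msg[(i, j)] = [1.0 if t == s else 0.0 for t in range(q)]
    ok = all(color[u] != color[v] for u, v in edges)
    return color if ok else None


def color_graph(n, edges, q, schedule=(10, 20, 40), seed=0):
    """Restart protocol of the text: try I = 10, then 20, then 40."""
    rng = random.Random(seed)
    for n_iter in schedule:
        result = bp_decimation(n, edges, q, n_iter, rng)
        if result is not None:
            return result
    return None
```

For instance, `color_graph(5, [(0, 1), (1, 2), (2, 3), (3, 4)], 3)` returns a proper 3-coloring of a path, while a triangle with q = 2 returns `None`. Each decimation step costs a number of operations proportional to I times the number of edges, so the sketch is quadratic in the graph size and only meant for small instances.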

VII. CONCLUSIONS 

Let us summarize the results. They are perhaps best illustrated looking back to the cartoon in fig. [1] where the 
importance of the size of clusters is evidenced. We find that the set of solutions of the q-coloring problem undergoes 
the following transitions as the connectivity is increased: 

(i) At low connectivity, $c < c_d$, many clusters might exist but they are very small and the measure over the set of 
solutions is dominated by the single giant cluster described by the replica symmetric approach. 

(ii) Only at the dynamical transition $c_d$ does the giant cluster decompose abruptly into an exponentially large number 
of clusters (pure states). For connectivities $c_d < c < c_c$, the measure is dominated by an exponential number of 
clusters. Yet, the total number of solutions is given by the replica symmetric entropy (14), and the marginals 
are given by the fixed point of the replica symmetric equations (belief propagation). Starting from this 
transition the uniform sampling of solutions becomes hard. 

(iii) At connectivity $c_c$ the set of solutions undergoes a condensation transition, similar to the one appearing in mean 
field spin glasses. In the condensed phase the measure is dominated by a finite number of the largest clusters. The 
total entropy is strictly smaller than the replica symmetric one and has a discontinuity in the second derivative 
at $c_c$. 

(iv) When connectivity $c_s$ is reached, no more clusters exist: this is the COL/UNCOL transition. Note that the 
entropy of the last existing clusters is strictly positive, and not given only by the contribution of the isolated nodes, 
leaves and other small subgraphs; the COL/UNCOL transition is thus discontinuous in entropy. 

This picture is very similar to the well-known scenario of the glass transition in temperature, with the dynamical 
and glass (Kauzmann) transitions [61]. In some cases, the main one being the 3-coloring of Erdos-Renyi graphs, 
the clustering and the condensation transitions merge and a continuous transition takes place at $c_d = c_c$, which then 
coincides with the local instability of the replica symmetric solution. Interestingly, the variational approach of [35] is 
very precise near the continuous clustering transition. Since the 3-SAT problem behaves in the same manner, this 
resolves the apparent contradictions between the results of [35] and [24]. 

In addition to the transitions describing the organization of clusters, another important phenomenon concerning 
the internal structure of clusters takes place. A finite fraction of frozen variables can appear in the clusters (a frozen 
variable takes the same color in all the solutions that belong to the cluster). We found that the fraction of such 
variables in each cluster undergoes a first order transition and jumps from zero to a finite fraction at a connectivity 
that depends on the size of the cluster. In particular: 

(v) There exists a critical connectivity $c_r$ (rigidity/freezing) at which the thermodynamically relevant clusters 
(those that dominate the Gibbs measure) start to contain a finite fraction of frozen variables. 

The results above were obtained within the 1RSB scheme, but should not change when considering further steps 
of RSB (an exception might be the 3-coloring near the clustering transition). 

We discussed some algorithmic consequences of these transitions. First, the belief propagation algorithm is efficient 
in counting solutions and estimating marginals up to the condensation transition. More interestingly, it can also be 
used, just like survey propagation, together with a decimation procedure in order to find solutions, as we numerically 
demonstrated. Secondly, the dynamical transition is not the one at which simple algorithms fail, as we illustrated using 
the Walk-COL strategy. For the 3-coloring of ER graphs, there is even a rigorous proof of algorithmic performance 
beyond $c_d = 4$ and up to c = 4.03 [10]. We argued that, instead, the rigidity phenomenon is responsible for the onset 
of computational hardness. This is a major point that we hope to see investigated further in the future. 

Our study opens the way to many new and promising investigations and developments. For instance, we wrote the 
equivalent of the survey propagation equations for a general value of m, which has a particularly simple form for m = 1 
(39). It would be interesting to use these equations to find solutions. The behavior at finite temperature and the 
performance of the annealing procedure are also of interest. It would furthermore be interesting to re-discuss other 
finite connectivity spin glass models like for instance the lattice glass models (5(| in the light of our findings. The 
stability towards more steps of replica symmetry breaking, or the super-symmetric approach, should be further 
investigated. Finally, it would be interesting to combine the entropic and energetic approaches to investigate the frozen 
variables in the meta-stable states. We hope that our results will stimulate activity along these lines of thought. 

APPENDIX A: STABILITY OF THE PARAMAGNETIC SOLUTION 



In this appendix, we show how to compute the stability of the paramagnetic solution towards the continuous 
appearance of a 1RSB solution. This happens, as usual for a continuous transition, when the spin glass correlation 
length, or equivalently, the spin glass susceptibility, diverges. Obviously, the presence of a diverging correlation 
length invalidates the premise of the RS cavity method. Recall that the spin glass susceptibility is defined as 

$$\chi_{SG} = \frac{1}{N} \sum_{i,j} \langle s_i s_j \rangle_c^2 , \tag{A1}$$

and can be rewritten for the present purpose as 

$$\chi_{SG} = \sum_{d=0}^{\infty} \gamma^d \, \mathbb{E}\left( \langle s_0 s_d \rangle_c^2 \right) , \tag{A2}$$

where we consider the average over graphs, in the thermodynamic limit, and where the spins $s_0$ and $s_d$ are at distance d. The 
factor $\gamma^d$ stands for the average number of neighbors at distance d, when $d \ll \log N$. Assuming that the limit for large 
d of the summands in (A2) exists (with the limit $N \to \infty$ performed first), we relate it to the stability parameter: 

$$\lambda = \lim_{d \to \infty} \gamma \left[ \mathbb{E}\left( \langle s_0 s_d \rangle_c^2 \right) \right]^{1/d} . \tag{A3}$$

Then the series in (A2) is essentially geometric, and converges if and only if $\lambda < 1$. 

Using the fluctuation-dissipation theorem we relate the correlation $\langle s_0 s_d \rangle_c$ to the variation of the magnetization in $s_0$ 
caused by an infinitesimal field in $s_d$. Finally, using the fact that we perform the large-N limit first, the variation 
above is dominated by the direct influence through the length-d path between the two nodes, and this induces a 
"chain" relation: if the path involves the nodes (d, d-1, ..., 0) we have 

$$\mathbb{E}\left( \langle s_0 s_d \rangle_c^2 \right) \propto \mathbb{E}\left[ \prod_{i=1}^{d-1} \left( \frac{\partial \psi^{i \to i-1}}{\partial \psi^{i+1 \to i}} \right)^{2} \right] . \tag{A4}$$

The stability parameter of the paramagnetic solution of the cavity equations towards small perturbations can be 
computed from the following Jacobian 

$$T^{\tau\sigma} = \left. \frac{\partial \psi^{1 \to 0}_{\tau}}{\partial \psi^{2 \to 1}_{\sigma}} \right|_{RS} , \tag{A5}$$

which gives the infinitesimal change of the output probability $\psi^{1 \to 0}_{\tau}$ caused by a change in the input probability 
$\psi^{2 \to 1}_{\sigma}$. The index RS says that the expression has to be evaluated at the RS paramagnetic solution. 

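This matrix has only two different entries: all the diagonal elements are equal, and all the non-diagonal elements 
are also equal. As an immediate consequence all Jacobians commute and are thus simultaneously diagonalizable, so 
that it is sufficient to study the effect of one cavity iteration (one step in the chain). The matrix T has only 
two distinct eigenvalues, 

$$\lambda_1 = \left. \left( \frac{\partial \psi^{1 \to 0}_{1}}{\partial \psi^{2 \to 1}_{1}} - \frac{\partial \psi^{1 \to 0}_{1}}{\partial \psi^{2 \to 1}_{2}} \right) \right|_{RS} , \qquad \lambda_2 = \left. \left( \frac{\partial \psi^{1 \to 0}_{1}}{\partial \psi^{2 \to 1}_{1}} + (q-1) \frac{\partial \psi^{1 \to 0}_{1}}{\partial \psi^{2 \to 1}_{2}} \right) \right|_{RS} . \tag{A6}$$

The second eigenvalue corresponds to the homogeneous eigenvector (1, 1, ..., 1) and describes a fluctuation changing 
all $\psi_\tau$, $\tau = 1, \ldots, q$, by the same amount, thus maintaining the color symmetry. It is thus not likely to be the relevant 
one and we will see that indeed $\lambda_2 = 0$. The first eigenvalue, however, is (q-1)-fold degenerate and its eigenvectors 
are spanned by (1, -1, 0, ..., 0), (0, 1, -1, 0, ..., 0), ..., (0, ..., 0, 1, -1). The corresponding fluctuations explicitly break 
the color symmetry, and are in fact the critical ones. Using the cavity recursion (4), the two derivatives simply read 

$$\left. \frac{\partial \psi^{1 \to 0}_{\tau}}{\partial \psi^{2 \to 1}_{\tau}} \right|_{RS} = - \frac{(1 - e^{-\beta}) \left[ \psi_\tau - (\psi_\tau)^2 \right]}{1 - (1 - e^{-\beta}) \psi_\tau} , \qquad \left. \frac{\partial \psi^{1 \to 0}_{\tau}}{\partial \psi^{2 \to 1}_{\sigma}} \right|_{RS} = \frac{(1 - e^{-\beta}) \, \psi_\tau \psi_\sigma}{1 - (1 - e^{-\beta}) \psi_\sigma} \quad (\sigma \neq \tau) , \tag{A7}$$

so that the values of the two eigenvalues evaluated at the RS solution, where all $\psi$ are equal to 1/q, are 

$$\lambda_1 = - \frac{1 - e^{-\beta}}{q - 1 + e^{-\beta}} , \qquad \lambda_2 = 0 . \tag{A8}$$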

The stability parameter (A3) is thus $\lambda = \gamma \lambda_1^2$ and the critical temperatures below which the instability sets in are 

$$T^{\rm reg}(q, c) = -1 / \log\left( 1 - \frac{q}{\sqrt{c-1} + 1} \right) , \qquad T^{\rm ER}(q, c) = -1 / \log\left( 1 - \frac{q}{\sqrt{c} + 1} \right) , \tag{A9}$$

for regular and Erdos-Renyi graphs respectively. Thus at zero temperature the critical connectivities read 

$$c^{\rm reg}_{\rm RS\,stab} = q^2 - 2q + 2 , \qquad c^{\rm ER}_{\rm RS\,stab} = q^2 - 2q + 1 . \tag{A10}$$
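
The eigenvalues (A8) and the zero-temperature thresholds (A10) can be checked numerically with a short script. The sketch below is an illustration only: it assumes that the cavity recursion has the form $\psi_s \propto \prod_k [1 - (1 - e^{-\beta}) \psi^k_s]$, differentiates the update with respect to a single incoming message (the other incoming messages, all equal to 1/q, cancel in the normalization), and uses hypothetical function names.

```python
import math


def cavity_update(psi_in, beta):
    """Output message for one varying input; other inputs sit at 1/q and cancel."""
    eps = 1.0 - math.exp(-beta)
    w = [1.0 - eps * p for p in psi_in]
    z = sum(w)
    return [x / z for x in w]


def jacobian_eigenvalues(q, beta, h=1e-6):
    """Central finite differences at the paramagnetic point psi = 1/q (needs q >= 2)."""
    base = [1.0 / q] * q

    def deriv(out_idx, in_idx):
        up = list(base)
        dn = list(base)
        up[in_idx] += h
        dn[in_idx] -= h
        return (cavity_update(up, beta)[out_idx]
                - cavity_update(dn, beta)[out_idx]) / (2.0 * h)

    diag, off = deriv(0, 0), deriv(0, 1)
    lambda_1 = diag - off              # (q-1)-fold degenerate, breaks color symmetry
    lambda_2 = diag + (q - 1) * off    # homogeneous direction
    return lambda_1, lambda_2


def zero_temperature_thresholds(q):
    """Solve gamma * lambda_1**2 = 1 with lambda_1 = -1/(q-1)."""
    gamma_c = (q - 1) ** 2
    return gamma_c + 1, gamma_c        # (regular: gamma = c - 1, Erdos-Renyi: gamma = c)
```

With q = 3 the second function returns (5, 4), in agreement with the instability of the 3-coloring of Erdos-Renyi graphs at c = 4 mentioned in the main text.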

These results coincide perfectly with the numerical simulations of the cavity recursion of [32]. The analytical 
expressions equivalent to (A9) were in fact first obtained in [EtJ in the context of the reconstruction problem on trees 
as an upper bound for the Gibbs measure extremality, and their connection with the statistical physics approach was 
explained in [53]. The case of bi-regular random graphs can be easily understood by noticing that two recursions 
should be considered, one with $\gamma = c_1 - 1$, and one with $\gamma = c_2 - 1$. As a consequence, the stability point is equivalent 
in this case to the one of a regular random graph with an effective connectivity equal to $c = 1 + \sqrt{(c_1 - 1)(c_2 - 1)}$. 

Another instability appears when $\gamma |\lambda_1| > 1$. This has been referred to as the modulation instability in [56]. Actually, 
this is the continuous instability towards the appearance of anti-ferromagnetic order. Since at zero temperature 
$\lambda_1 = (1 - q)^{-1}$, for connectivities larger than $c_{\rm mod} = q$ for random regular graphs (and $c_{\rm mod} = q - 1$ for 
Erdos-Renyi) the paramagnetic solution becomes unstable towards anti-ferromagnetic order. However, this is correct only if 
we study a tree with some given (and well chosen) boundary conditions; as noted in [56], the anti-ferromagnetic 
solution is impossible on random graphs because of the existence of frustrating loops of arbitrary length. The cavity 
equations (4) can actually never converge towards an anti-ferromagnetic solution on a random graph. Instead, when 
iterating, the fields oscillate between different solutions (thus the name modulation). 

In other words, although on a random tree with special boundary conditions there exists for $c > c_{\rm mod}$ a nontrivial 
solution to the cavity recursion (for the Gibbs state is no longer unique (|15[)). this solution does not exist on a 
random graph (and the Gibbs state is still extremal HU)). Note that this instability could anyway be a source of 
numerical problems, which can be overcome by requiring that the distribution of cavity fields $\mathcal{P}(\psi)$ over the ensemble of 
random graphs be symmetric under color permutations. Another possibility is to randomly mix the new and 
old populations in the population dynamics so that the anti-ferromagnetic oscillations are destroyed. 



APPENDIX B: THE RELATIVE SIZES OF CLUSTERS IN THE CONDENSED PHASE 

In this section, we introduce the Poisson-Dirichlet point process and briefly review some of its important 
properties. We also sketch its deep connection with the size of clusters in the condensed phase. The Poisson-Dirichlet 
(PD) point process is a set of points $\{x_i\}$, $i = 1, \ldots, \infty$, such that $x_1 > x_2 > x_3 > \ldots$ and $\sum_i x_i = 1$. To construct 
these points we consider a Poisson process $\{y_i\}$, $i = 1, \ldots, \infty$, of intensity measure $y^{-1-m^*}$, $0 < m^* < 1$ (note that 
this measure is not a probability measure). We order the sequence $\{y_i\}$ in such a way that $y_1 > y_2 > y_3 > \ldots$ and 
define the PD point process as 

$$x_i = \frac{y_i}{\sum_{j=1}^{\infty} y_j} . \tag{B1}$$

If we identify the parameter $m^*$ with the value for which the complexity is zero, $\Sigma(m^*) = 0$, then $y_i$ is proportional 
to the number of solutions in cluster i (or to $e^{-\beta F}$ for non-zero temperature), and $x_i$ is the size of that cluster 
relative to the total number of solutions. This connection was (on a non-rigorous level) understood in [80]; for a more 
mathematical review see [81]. Note that, due to the permutation symmetry in graph coloring, each cluster always comes in 
q! copies (differing by a color permutation). 

To get a feeling for the PD statistics let us answer in fig. 12 the following question: given the value $m^*$, how 
many clusters do we need to cover a fraction r of solutions, in other words what is the smallest k such that $\sum_{i=1}^{k} x_i > r$? 
The mathematical properties of the PD process are very clearly reviewed in [82]. To avoid confusion, note at this 
point that the PD process we are interested in is the PD($m^*$, 0) in the notation of [82]. In the mathematical literature, 
the name often refers to the PD(0, $\theta$) process, without indexing by the two parameters. Let us recall two useful results. Any 
moment of any $x_i$ can be computed from the generating function 

$$\mathbb{E}\left[ \exp(-\lambda / x_i) \right] = e^{-\lambda} \, \phi_{m^*}(\lambda)^{i-1} \, \psi_{m^*}(\lambda)^{-i} , \tag{B2}$$

where $\lambda > 0$ and the functions $\phi_{m^*}$ and $\psi_{m^*}$ are defined as 

$$\phi_{m^*}(\lambda) = m^* \int_1^{\infty} e^{-\lambda x} x^{-1-m^*} \, dx , \tag{B3}$$

$$\psi_{m^*}(\lambda) = 1 + m^* \int_0^1 \left( 1 - e^{-\lambda x} \right) x^{-1-m^*} \, dx . \tag{B4}$$

Another relation is that the ratio of two consecutive points $R_i = x_{i+1}/x_i$, $i = 1, 2, \ldots$, is distributed as $i m^* R_i^{i m^* - 1}$. 
In particular its expectation is $\mathbb{E}[R_i] = i m^* / (1 + i m^*)$ and the random variables $R_i$ are mutually independent. We 
used these relations to obtain the data in figure 12. 

FIG. 12: Sketch of the size of the largest clusters for a given value of the parameter $m^*$. The lower curve is related to the average 
size of the largest cluster as $1/\mathbb{E}[1/x_1] = 1 - m^*$. The following curves are related to the size of the i largest clusters; their distances 
are $\mathbb{E}[R_i] \, \mathbb{E}[R_{i-1}] \cdots \mathbb{E}[R_1] (1 - m^*)$. 
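
The independence of the ratios $R_i$ suggests a simple way to draw approximate samples of the PD weights and to answer the question asked above (how many clusters cover a fraction r of solutions). The following is a hedged sketch rather than the procedure used for figure 12: it truncates the infinite sequence after `n_points` terms (an approximation that degrades as $m^*$ approaches 1, where the tail is heavy), and the function names are hypothetical. Since $R_i$ has density $\theta R^{\theta - 1}$ on [0, 1] with $\theta = i m^*$, it can be drawn as $U^{1/\theta}$ with U uniform.

```python
import random


def sample_pd_weights(m_star, n_points, rng):
    """Truncated sample of PD(m*, 0) weights built from independent ratios R_i."""
    weights = [1.0]                          # unnormalized x_1
    for i in range(1, n_points):
        u = 1.0 - rng.random()               # uniform in (0, 1]
        ratio = u ** (1.0 / (i * m_star))    # density i*m* R^(i*m* - 1) on [0, 1]
        weights.append(weights[-1] * ratio)  # x_{i+1} = R_i * x_i
    total = sum(weights)
    return [w / total for w in weights]


def clusters_to_cover(x, r):
    """Smallest k such that x_1 + ... + x_k > r (None if never reached)."""
    acc = 0.0
    for k, xi in enumerate(x, start=1):
        acc += xi
        if acc > r:
            return k
    return None
```

Averaging `clusters_to_cover` over many samples for several values of `m_star` gives a Monte Carlo estimate of the quantity discussed in the text, up to the truncation error.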



APPENDIX C: THE 1RSB FORMALISM AT m = 1 AND THE RECONSTRUCTION EQUATIONS 

In this appendix we discuss the considerable simplification of eq. (28) that is obtained by working directly at m = 1. 
This was first remarked and proved in [53] when dealing with the tree reconstruction problem (for a discussion of a 
case where the RS solution is not paramagnetic see [Hj]). We first introduce the probability distribution of fields $\overline{P}(\psi)$ 
averaged over the graph 

$$\overline{P}(\psi) \equiv \int {\rm d}\mathcal{P}[P] \, P(\psi) = \sum_k Q_1(k) \frac{1}{Z_1} \int \prod_{i=1}^{k} {\rm d}\psi^i \, \overline{P}(\psi^i) \, \delta\left[ \psi - \mathcal{F}(\{\psi^i\}) \right] Z_0 , \tag{C1}$$

where $Z_1$ is computed from (20) as 

$$Z_1 = \int \prod_{i=1}^{k} {\rm d}\psi^i \, \overline{P}(\psi^i) \, Z_0 = \sum_{s=1}^{q} \prod_{i=1}^{k} \left[ 1 - (1 - e^{-\beta}) \, \overline{\psi}_s \right] , \tag{C2}$$

where $\overline{\psi} = \int {\rm d}\psi \, \overline{P}(\psi) \, \psi$. Generally, $\overline{\psi}$ is a solution of the RS equation (12), which is easily seen from (20), (28). Since 
the RS solution is the paramagnetic one, $\overline{\psi} = 1/q$, the form of $Z_1$ is particularly simple. 

In the next step, we want to get rid of the term $Z_0$ in eq. (C1). We thus introduce q distribution functions $P_s$, 

$$P_s(\psi) = q \, \psi_s \, \overline{P}(\psi) . \tag{C3}$$

It is then easy to show that if $\overline{\psi} = 1/q$ then $P_s(\psi)$ satisfies 

$$P_s(\psi) = \sum_k Q_1(k) \sum_{s_1 \ldots s_k} \prod_{i=1}^{k} \pi(s_i | s) \int \delta\left[ \psi - \mathcal{F}(\{\psi^i\}) \right] \prod_{i=1}^{k} {\rm d}\psi^i \, P_{s_i}(\psi^i) , \tag{C4}$$

where 

$$\pi(s_i | s) = \frac{1 - (1 - e^{-\beta}) \, \delta(s_i, s)}{q - (1 - e^{-\beta})} . \tag{C5}$$

We solve eq. (C4) by population dynamics. In order to do this, one needs to deal with q populations of q-component 
fields, and to update them according to (C4). It is only a functional equation and not a double-functional one like the 
general 1RSB equation (28). Moreover, the absence of the reweighting term $Z_0^m$ simplifies the population dynamics 
algorithm significantly. Finally, it is important to note that the computational complexity here is the same 
for regular and ER random graphs. 

A crucial theorem is also proven in [53]: the population dynamics of eq. (C4) has a nontrivial solution if and only 
if it converges to a nontrivial solution starting from the initial conditions in which every field of population s is fully polarized, 

$$\psi_r = \delta(r, s) . \tag{C6}$$

This shows that when a paramagnetic solution is found starting from these initial conditions, then no other solutions exist. 

Similar manipulations allow us to obtain the replicated free energy (|2ip which is equal in this case to the replica 
symmetric free energy ([9]), and the free energy ([26]) inside the corresponding states as 

-/W) = E Q ( fc ) E E-II^I 5 ) / ( lo S Z o) Hd^M), (C7) 
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where the normalization factors Zq, Zq are defined by ([6]) and ([8|). The complexity follows from ([24"]) . Since the 
replicated free energy 1) is equal, according to ([27]) . to the total free energy, we showed the statement used 
several time in the paper, i.e. the total free energy (entropy) at m = 1 is equal to the replica symmetric free energy 
©. 

Another important point is that one can write the recursion separating the hard and soft fields. In general, at zero 
temperature, we can write the distribution $P_s(\psi)$ in eq. (C4) as 

$$P_s(\psi) = \sum_{r=1}^{q} \mu_{r,s} \, \delta(\psi_r - 1) + \Big( 1 - \sum_{r=1}^{q} \mu_{r,s} \Big) \tilde{P}_s(\psi) , \tag{C8}$$

where $\tilde{P}_s$ is the distribution of the soft fields. Plugging this into eq. (C4) and taking into account the initial condition (C6) and color symmetry, we see that 
$\mu_{r,s} = q\eta \, \delta(r, s)$, where $\eta$ satisfies 

$$q\eta = \sum_k Q_1(k) \sum_{m=0}^{q-1} (-1)^m \binom{q-1}{m} \left( 1 - \frac{m \, q\eta}{q-1} \right)^{k} . \tag{C9}$$

On ER graphs, the sum can be performed analytically and one finds 

$$q\eta = \left[ 1 - e^{-c \, q\eta / (q-1)} \right]^{q-1} . \tag{C10}$$

This equation can be solved iteratively starting from $\eta_{\rm init} = 1/q$. It is a very simple equation, like the one obtained 
for m = 0, which gives us a very efficient way to compute the fraction of hard fields at m = 1 for both regular and 
ER graphs. Indeed $\eta$ is larger than zero only above a certain average connectivity $c_r(m = 1)$. 
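
The iterative solution of (C9) and (C10) is short enough to be written out. The sketch below is only an illustration with hypothetical function names; it works with $\mu = q\eta$, starts from $\mu = 1$ (i.e. $\eta_{\rm init} = 1/q$), and clamps tiny negative values that can arise from floating-point cancellation in the alternating sum.

```python
import math


def update_regular(mu, q, k):
    """Right-hand side of (C9) when every node has exactly k incoming fields."""
    total = sum((-1) ** m * math.comb(q - 1, m) * (1.0 - m * mu / (q - 1)) ** k
                for m in range(q))
    return max(0.0, total)  # guard against round-off in the alternating sum


def update_er(mu, q, c):
    """Right-hand side of (C10): Poissonian number of incoming fields, mean c."""
    return (1.0 - math.exp(-c * mu / (q - 1))) ** (q - 1)


def hard_field_fraction(update, q, connectivity, n_iter=1000):
    """Iterate mu -> update(mu) from mu = q * eta_init = 1; returns q * eta."""
    mu = 1.0
    for _ in range(n_iter):
        mu = update(mu, q, connectivity)
    return mu
```

Scanning the connectivity and recording where `hard_field_fraction` first becomes nonzero gives a numerical estimate of $c_r(m = 1)$; close to the threshold more iterations may be needed.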






APPENDIX D: NUMERICAL METHODS 



In this section, we detail the numerical methods we used to solve the 1RSB equations (20, 28), and the procedures 
used to generate the data. We use a population dynamics method, as introduced in [H,[13|, and model the distribution 
$P^{i \to j}(\psi^{i \to j})$ by a population of N vectors $\psi^{i \to j}$. To compute $P^{i \to j}(\psi^{i \to j})$ knowing the $P^{k \to i}(\psi^{k \to i})$ for all incoming 
k we perform the 1RSB recursion in eq. (20) in two steps: (i) first we compute the new vectors $\psi^{i \to j}$ using the simple 
RS recursion in eq. (31) (this is the iterative step) and (ii) we take into account the weight $(Z_0^{i \to j})^m$ for each of the 
vectors (this is the reweighting step). For the reweighting we tried different strategies, two of which perform very well. 

a) For every field $\psi$ in the population, we keep its weight $Z_0$. We then compute the cumulative distribution of 
weights $Z_0$ and sample the incoming fields uniformly. Using bisection we generate a random field with its 
proper weight in O(log N) steps. A complete iteration thus takes O(N log N) steps. 

b) We compute N new vectors and then build a new population by cloning some of them while erasing 
others, so that in this new population each field is present according to its weight (in principle, one can even 
change the size of the population, although we have not implemented this strategy). This second approach can 
be implemented in linear time (generating an ordered list of random numbers is a linear problem, see [83]), but 
is a bit less precise as we introduce redundancy in the population. 

We finally chose to use the second strategy, as we observed that it performs almost as well as the first one (for a 
given size of population) while being much faster, so that, for a given computer time, it allows a better representation 
of the population. We also force the population to be color-symmetric by adding a random shift of colors in the 
incoming messages. This is needed in order to avoid the anti-ferromagnetic solution. The learned reader will notice 
that this is equivalent to solving a disordered Potts glass instead of an anti-ferromagnetic model. Indeed the fact that 
an Ising anti-ferromagnet on a random graph is equivalent to an Ising spin glass was already noticed in [84]. 
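
The two reweighting strategies a) and b) can be illustrated as follows. This is a sketch under assumptions, not the original implementation: the function names are hypothetical, the fields are treated as opaque objects, and the linear-time generation of ordered random numbers uses normalized partial sums of exponential variables, which is one standard construction and not necessarily the one of [83].

```python
import bisect
import itertools
import random


def sample_by_bisection(fields, weights, n_draws, rng):
    """Strategy a): cumulative weights + bisection, O(log N) per draw."""
    cum = list(itertools.accumulate(weights))
    total = cum[-1]
    out = []
    for _ in range(n_draws):
        idx = bisect.bisect_right(cum, rng.random() * total)
        out.append(fields[min(idx, len(fields) - 1)])  # guard against round-off
    return out


def resample_by_cloning(fields, weights, rng):
    """Strategy b): clone/erase fields according to their weights in O(N)."""
    n = len(fields)
    # Sorted uniforms in linear time: normalized partial sums of n + 1 exponentials.
    gaps = [rng.expovariate(1.0) for _ in range(n + 1)]
    total_gap = sum(gaps)
    total_w = sum(weights)
    new_pop = []
    j, cum_w, cum_gap = 0, weights[0], 0.0
    for k in range(n):
        cum_gap += gaps[k]
        target = cum_gap / total_gap * total_w  # k-th smallest uniform on [0, total_w)
        while cum_w < target and j < n - 1:
            j += 1
            cum_w += weights[j]
        new_pop.append(fields[j])               # field j is cloned once per hit
    return new_pop
```

In strategy b) a field with a large weight appears several times in the new population while fields with small weights tend to disappear, which is the redundancy mentioned in the text.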

Another important issue is the presence of hard fields. Fig. 13 shows histograms of the first component of the vectors 
in the population for the 3- and 4-coloring of 5- and 9-regular random graphs respectively. It is interesting to see how 
they peak around fractional values due to the presence of hard fields (see the three upper ones). Maybe even more 
interesting are the lower ones, where no hard fields are present. However, since there are soft fields with values $1 - \epsilon$, 
where $\epsilon$ can be almost arbitrarily small, one cannot see from these pictures the absence of frozen variables. For the case 
c = 9, q = 4, m = 0.8 for instance, the presence of the quasi-hard fields makes the distribution clearly concentrate on 
values around one, zero and one half (note however that the amplitude, on a logarithmic scale, is far less important). 

FIG. 13: Histograms of the first component of the cavity field $\psi_1$, i.e. the probability that a node takes color one. Notice 
the logarithmic y-scale. Frozen fields ($\psi_1 = 1$) are present in the solution for the three upper cases; there are delta-peaks at 
0, 1, 1/2 and other simple fractions depending on q. Notice that even when frozen fields are not present, there are many almost 
frozen fields (the distributions only concentrate around 0, 1, 1/2 and other fractions). 



The quasi-hard fields are therefore very hard to distinguish numerically from the true hard ones. This is evidenced 
in fig. 14, where we plot the fraction of hard fields computed using the expression (38) together with a numerical 
estimate made by population dynamics without the separate hard/soft implementation. We show that the fraction 
of fields of value $1 - \epsilon$ is not zero in regions where we know that there are no hard fields, even for $\epsilon = 10^{-20}$. This 
demonstrates the presence of quasi-hard fields, with $\epsilon$ going to zero as the critical m is approached. This transition 
is further studied in [6^ |. 

FIG. 14: Fraction of the hard and the quasi-hard cavity fields $q\eta$ (a field is quasi-hard if $\psi > 1 - \epsilon$) in the 4-coloring of 9-regular 
graphs. The bold line is obtained with the analytical computation of the fraction of hard fields and the dot corresponds to the 
threshold $m_r$. 

An important simplification of the 1RSB update (20) arises when we consider the soft and the hard fields separately. 
The fraction of hard fields can be computed using the generalized SP equation (38), provided the ratio $Z^m_{\rm soft} / Z^m_{\rm hard}$ is 
computed. This considerably reduces the size of the population, as only soft fields have to be kept in memory. Another 
way to further speed up the code is to generate soft fields directly with a uniform measure instead of waiting for 
them to come out of the 1RSB recursion. Indeed they might be quite rare in the region of small m and one can 
spend a considerable amount of time before being able to sample them correctly. Generating the soft fields with a 
uniform weight turns out to be rather easy using the following method: (i) Choose two random colors $q_1$ and $q_2$. (ii) 
Perform the usual recursion (31) in order to have a new vector, but forbid incoming hard fields of colors $q_1$ and $q_2$. (iii) To 
obtain a uniform soft field generator, the resulting field should be weighted by $1/\binom{s}{2}$, where s is the number of non-null 
components in the vector. This is especially useful in the case of Erdos-Renyi graphs. 

The formula for the free entropy $\Phi_s(m)$ (32) also simplifies in this case. Consider a given site i; the site free entropy 
term can be split into three parts, when (i) the total field is hard, (ii) the total field is soft and (iii) the total field is contradictory. 
Then 

$$\Delta\Phi_s^{i} = \log\left( p_{\rm hard} (Z_{\rm hard})^m + p_{\rm soft} (Z_{\rm soft})^m \right) , \tag{D1}$$

where $p_{\rm hard}$ and $p_{\rm soft}$ are the probabilities that the total field is frozen and soft, respectively, and are given by the SP recursion. Indeed 
the probability that the total field is not contradictory ($p_{\rm hard} + p_{\rm soft}$) is the denominator in eq. (35), while $p_{\rm hard}$ is the 
numerator of eq. (35). The link part can also be simplified using the fact that contradictions arise when two incoming 
frozen messages of the same color are chosen, so that 

$$\Delta\Phi_s^{ij} = \log\left( p_{\rm no\,contr} (Z_{\rm no\,contr})^m \right) , \tag{D2}$$

where $p_{\rm no\,contr}$ is simply $1 - q\eta^2$. 

For m = 0 the formulas further simplify, as $Z^m_{\rm hard} = Z^m_{\rm soft} = Z^m_{\rm no\,contr} = 1$, so that 

$$\Delta\Phi_s^{i} = \log\left\{ \sum_{l=0}^{q-1} (-1)^l \binom{q}{l+1} \left[ 1 - (l+1)\eta \right]^{c} \right\} , \tag{D3}$$

$$\Delta\Phi_s^{ij} = \log\left( 1 - q\eta^2 \right) . \tag{D4}$$

This is precisely what was obtained within the energetic cavity approach in [30]. The numerical population dynamics 
implementation with the mixed hard/soft strategy is therefore as precise as it could be, since we obtain the exact evaluation 
in the m = 0 case. This simple computation also demonstrates how one can recover the energetic zero temperature 
limit from the generic formalism. 

Finally, we obtain the function $\Phi_s(m)$. We fit this function using an ansatz $\Phi_s(m) = a + b \, 2^m + c \, 3^m + \ldots$ and then 
perform the Legendre transform to obtain the entropy and complexity. It is also possible to compute the 
complexity directly from the population data, using the expression of the derivative of the potential in the code. Both 
methods lead to very good results. We show an example of the raw data and their fit in fig. 15, where the data have 
been obtained with a relatively small population (N = 5000) but where the mixed strategy separating the hard and 
soft fields has been used. For the purely soft-field branch, we used N = 50000. It took a few hours up to a few days to 
generate these curves on present Intel PCs. 

FIG. 15: The numerical results for the free entropy (30) and its fit with a function $a + b \, 2^x + c \, 3^x + \ldots$ for the 6-coloring of 19-regular 
graphs and the 4-coloring of 9-regular graphs. Circles give the analytical results at m = 0 and m = 1. On the right parts, 
we present the complexity versus internal entropy with the numerical points and the Legendre transform of the fit of the free 
entropy. The analytical result for $\Sigma_{\max}$ is also shown. 
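
Once the coefficients of the fit are known, the Legendre transform can be done in closed form. The following sketch assumes the standard relations $\Phi_s(m) = \Sigma(s) + m s$ and $s = \partial \Phi_s / \partial m$, so that $\Sigma = \Phi_s - m \, \partial \Phi_s / \partial m$; the function names and the way the coefficients are passed are choices made for this example.

```python
import math


def free_entropy_fit(m, coeffs):
    """Ansatz a + b*2**m + c*3**m + ... with coeffs = (a, b, c, ...)."""
    return sum(coef * (n + 1) ** m for n, coef in enumerate(coeffs))


def legendre_point(m, coeffs):
    """Return (s, Sigma): internal entropy and complexity at parameter m."""
    # d/dm of coef * (n+1)**m is coef * log(n+1) * (n+1)**m; the constant term drops out.
    s = sum(coef * math.log(n + 1) * (n + 1) ** m for n, coef in enumerate(coeffs))
    sigma = free_entropy_fit(m, coeffs) - m * s
    return s, sigma
```

Evaluating `legendre_point` on a grid of m values traces the curve of complexity versus internal entropy parametrically; the coefficients themselves would come from a least-squares fit of the population dynamics data.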

In the case of bi-regular random graphs, one needs two different populations: one for the fields going from nodes 
with connectivity $c_1$ and one for the fields going from nodes with connectivity $c_2$. Then each iteration for population 
1 (resp. 2) should be performed using as incoming messages the vectors of population 2 (resp. 1). Again, one can 
separately perform the recursion for the hard-field fractions in both populations. 

The case of Erdos-Renyi random graphs is more involved, as one needs a large number $N_{\rm pop}$ of populations, each 
of them of size N. In this case, using the separate hard/soft fields implementation and the formulae (D1, D2) for 
the complexity is crucial, as it allows a good precision even for smaller population sizes. We typically used 
$2 N_{\rm pop} / c \approx (1\text{-}3) \cdot 10^3$ and $N \approx (1\text{-}3) \cdot 10^2$. The error bars in table II are computed from several independent runs of the 
population dynamics. In each case we were able to make the equilibration times and the population sizes large enough 
such that by doubling the time or the population size we did not observe any significant systematic changes in the 
average results. 



APPENDIX E: HIGH-q ASYMPTOTICS 



The quenched averages in the large q limit are the same for the regular and Erdos-Renyi graphs and we thus consider 
directly the regular ensemble of connectivity c = k + 1. The appearance of a nontrivial 1RSB solution for m = 0, 
which corresponds to $c_{SP}$, was already computed in [31] and reads 

$$c_r(m = 0) = c_{SP} = q \left[ \log q + \log\log q + 1 - \log 2 + o(1) \right] , \tag{E1}$$

$$\eta_d(m = 0) = \frac{1}{q} \left[ 1 - \frac{1}{\log q} + o\left( \frac{1}{\log q} \right) \right] , \tag{E2}$$

while the coloring threshold is [31] 

$$c_s = 2 q \log q - \log q - 1 + o(1) . \tag{E3}$$

We now show how to compute the connectivity at which a solution with hard fields at m = 1 first appears, and how the complete 
free entropy $\Phi_s(m)$ can be computed close to the COL/UNCOL transition. 

1. The appearance of hard fields at m = 1 


We first show that the correct scaling for the appearance of hard fields at m = 1 is 

$$k = q \left[ \log q + \log\log q + \alpha \right] , \tag{E4}$$

and compute the value of $\alpha$. To order O(q) we can also write $k = (q-1) [\log(q-1) + \log\log(q-1) + \alpha]$. 
The starting point is the equation (C9), with $Q_1(x) = \delta(x - k)$. In the large q limit the fraction of hard fields is 
$\mu(q, k) = q\eta(q, k) = 1 - \theta(q, k)$, where $\theta(q, k) = o(1)$ is the fraction of soft fields. We check self-consistently at the end 
of the computation that only the first two terms of (C9) are important. Then we have 

$$\mu(q, k) = 1 - (q-1) \left( 1 - \frac{\mu(q, k)}{q-1} \right)^{k} \approx 1 - (q-1) \, e^{-k \mu(q, k)/(q-1)} . \tag{E5}$$

A self-consistent equation for $\theta(q, k)$ follows, 

$$\log(q-1) \, \theta(q, k) = (q-1)^{\theta(q, k)} \, e^{-\alpha} , \tag{E6}$$

which is solved by $\theta(q, k) = \gamma(\alpha) / \log(q-1)$ where 

$$\gamma(\alpha) \, e^{-\gamma(\alpha)} = e^{-\alpha} . \tag{E7}$$

The maximum of the left hand side is 1/e, reached for $\gamma(\alpha) = 1$. It means that a solution of (E7) exists only for $\alpha \geq 1$. Finally the 
hard fields appear in the 1RSB solution for m = 1 at connectivity 

$$c_r(m = 1) = q \left[ \log q + \log\log q + 1 + o(1) \right] . \tag{E8}$$

The clustering transition $c_d$ should be between $c_{SP}$ and $c_r(m = 1)$, as this is what we observed for finite q. We see 
that $c_{SP}$ and $c_r(m = 1)$ differ only in the third order and both are very far from the coloring threshold and also from 
the condensation transition, as we show in section E 2. It would be interesting to compute a large q expansion of the 
connectivity at which the hard fields appear in all the clusters (for all m such that $\Sigma(m) > 0$). Together with our 
conjecture about rigidity being responsible for the computational hardness, that might give a hint about the answer to 
the long-standing question [22]: "Is there a polynomial algorithm and $\epsilon$ such that the algorithm would color random 
graphs of average connectivity $(1 + \epsilon) q \log q$ for all large q?" 
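
Equation (E7) has no elementary closed form for $\gamma(\alpha)$, but it is easily solved numerically. The sketch below uses bisection on the interval [0, 1], where $\gamma e^{-\gamma}$ increases monotonically from 0 to 1/e; selecting this smaller root is an assumption made here, motivated by the fact that iterating (E5) from the all-hard initial condition reaches the fixed point with the fewest soft fields. The function names are hypothetical.

```python
import math


def gamma_of_alpha(alpha, tol=1e-14):
    """Smaller root of gamma * exp(-gamma) = exp(-alpha); None if alpha < 1."""
    if alpha < 1.0:
        return None                      # no solution: hard fields cannot appear
    target = math.exp(-alpha)
    lo, hi = 0.0, 1.0                    # gamma*exp(-gamma) is increasing on [0, 1]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def soft_field_fraction(alpha, q):
    """Leading-order fraction of soft fields theta = gamma(alpha) / log(q - 1)."""
    g = gamma_of_alpha(alpha)
    return None if g is None else g / math.log(q - 1)
```

At the threshold $\alpha = 1$ the root is $\gamma = 1$, so the fraction of soft fields at the onset is $1/\log(q-1)$ to leading order.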



2. The condensation transition 



To compute the large-q asymptotics of the condensation transition, we first need to derive the large-q expansion of 
the free entropy in the connectivity regime $c \approx 2 q \log q$. Let us show self-consistently that the following scaling 
is relevant for the condensation transition in the large q limit, 

$$c = 2 q \log q - \gamma \log q + \alpha , \tag{E9}$$

$$\eta = \frac{1}{q} - \frac{B}{q^2} , \tag{E10}$$

and compute the constants $\gamma$, $\alpha$, B. Using the above scaling, the function $w(\eta)$ (33) is dominated by the first two 
terms in the numerator and denominator, and reads in the first two leading orders $1 - q \, w(\eta) \approx q \, e^{-c\eta} / 2$, so that 

$$w(\eta) = \frac{1}{q} - \frac{1}{2 q^2} + O\left( \frac{\log q}{q^3} \right) \tag{E11}$$

independently of $\gamma$, $\alpha$, and B. To take into account the reweighting we expand eq. (38) in the two leading orders, 

$$\eta = \frac{1}{q} - \frac{1}{2 q^2} \frac{Z^m_{\rm soft}}{Z^m_{\rm hard}} + O\left( \frac{\log q}{q^3} \right) . \tag{E12}$$

Note that almost all the incoming fields are hard, i.e. have one component of value 1. Since there are on average only 
$2 B \log q$ incoming soft fields, the leading order of the hard-field reweighting (the normalization in eq. (3)) is different 
from 1 with a probability of only $O(\log q / q)$. Similarly, almost all the soft fields have two nonzero and equal components. 
Their normalization is thus almost surely 2, and the average reweighting factor of the soft fields is 

$$Z^m_{\rm soft} = 2^m \left[ 1 + O\left( \frac{\log q}{q} \right) \right] . \tag{E13}$$

Finally, 

$$\eta = \frac{1}{q} - \frac{2^m}{2 q^2} \left[ 1 + O\left( \frac{\log q}{q} \right) \right] . \tag{E14}$$

Therefore the constant B in (E10) is $B = 2^m / 2$, independently of $\gamma$ and $\alpha$. 

The computation of the complexity requires the next order in the hard-field reweighting. Indeed the normalization 
in (4) might not be 1 but 1/2; this happens when an arriving soft field involves the color of the 
hard field in consideration. The probability of this event is $2 c (1 - q\eta) / q = O(\log q / q)$. The hard-field reweighting is thus 

$$Z^m_{\rm hard} = 1 - \frac{2 c (1 - q\eta)}{q} + \frac{2 c (1 - q\eta)}{q} \, \frac{1}{2^m} + O\left( \frac{\log^2 q}{q^2} \right) . \tag{E15}$$



We now expand the replicated free energy ([3H]) in the large-$q$ limit and in the regime (E9). Recall that, from (JSJ [H]),

$$\Phi_s(m) = \log\overline{(Z^{i})^m} - \frac{c}{2}\log\overline{(Z^{ij})^m}. \tag{E16}$$

The averages are over the population in the sense of (|2ip . 

The site free energy is the logarithm of the average of the total field normalization. This average can be split into 
three parts when (i) the total field is a hard field, (ii) the total field is a soft field and (iii) the total field is contradictory 
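(and its normalization zero). The probability that the total field is not contradictory is the denominator in eq. (35),

$$g(\eta) = \sum_{l=0}^{q-1} (-1)^l \binom{q}{l+1}\left[1-(l+1)\eta\right]^c, \tag{E17}$$

where again only the first two terms are relevant in the expansion. The site free energy is then

$$\log g(\eta) + \log\left[q\,w(\eta)\,\overline{Z^m_{\rm hard}} + \big(1-q\,w(\eta)\big)\,\overline{Z^m_{\rm soft}}\right] \simeq \log\left\{q(1-\eta)^c - \frac{q(q-1)}{2}(1-2\eta)^c\right\} + \log\left[1 - \frac{1}{2q} - \frac{2c(1-q\eta)}{q}\left(1-\frac{1}{2^m}\right) + \frac{2^m}{2q}\right], \tag{E18}$$

where

$$\log\left\{q[1-\eta]^c - \frac{q(q-1)}{2}[1-2\eta]^c\right\} = \log q + c\log[1-\eta] + \log\left[1 - \frac{q-1}{2}\left(\frac{1-2\eta}{1-\eta}\right)^c\right] \tag{E19}$$

$$\simeq \log q + c\log\left[1 - \frac{1}{q} + \frac{2^m}{2q^2}\right] + \log\left[1-\frac{1}{2q}\right]. \tag{E20}$$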
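
As a quick numerical sanity check of the last logarithm in (E19) and (E20), the sketch below evaluates the correction term $\frac{q-1}{2}\left(\frac{1-2\eta}{1-\eta}\right)^c$ at $c = 2q\log q$ and $\eta = 1/q - 2^m/(2q^2)$ and compares it with $1/(2q)$. The function name and the chosen values of $q$ are illustrative assumptions.

```python
import math


def two_term_correction(q, m):
    """Return (q-1)/2 * ((1-2*eta)/(1-eta))**c at c = 2 q log q.

    eta is set to 1/q - 2**m / (2 q**2), i.e. the scaling (E10) with B = 2**m / 2.
    """
    c = 2.0 * q * math.log(q)
    eta = 1.0 / q - 2.0**m / (2.0 * q**2)
    # (1-2*eta)/(1-eta) = 1 - eta/(1-eta); log1p keeps accuracy for small eta.
    log_ratio = math.log1p(-eta / (1.0 - eta))
    return 0.5 * (q - 1) * math.exp(c * log_ratio)


if __name__ == "__main__":
    for q in (10, 100, 1000, 10000):
        # The product 2*q*value should approach 1 as q grows.
        print(q, 2 * q * two_term_correction(q, m=1.0))
```

The convergence is slow, with relative corrections of order $\log q/q$, which is consistent with keeping only the two leading orders in the expansion.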



To compute the link contribution in (E16) we need to consider two fields $\psi^{i\to j}$ and $\psi^{j\to i}$ and to compute

$$Z^{ij} = 1 - \sum_{s=1}^{q}\psi_s^{i\to j}\,\psi_s^{j\to i}. \tag{E21}$$



There are three different cases: 

1. Two hard fields are chosen; then $Z^{ij} = 0$ with probability $q\eta^2$ (this is of order $1/q$) and $Z^{ij} = 1$ with probability 
$q(q-1)\eta^2$ (this is of order 1). 

2. Two soft fields are chosen; then $Z^{ij} = 1$ with probability $(1-q\eta)^2$ (this is of order $1/q^2$), all other situations 
being $O(1/q^3)$. Recall that the dominant soft fields are two-component fields of type $(1/2, 1/2, 0, 0, \ldots)$. 

3. One hard and one soft field are chosen; then $Z^{ij} = 1$ with probability $2\eta(1-q\eta)(q-2)$, and $Z^{ij} = 1/2$ with 
probability $4\eta(1-q\eta)$ (this is of order $1/q^2$). 
On average, one thus obtains for the link contribution

$$\log\overline{(Z^{ij})^m} = \log\left[1 - q\eta^2 - 4\eta(1-q\eta) + 4\eta(1-q\eta)\,\frac{1}{2^m}\right]. \tag{E22}$$
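
The case analysis behind the link contribution can be tabulated directly. The sketch below lists the leading configurations with their weights, taking the weight of the hard-soft case with $Z^{ij} = 1$ as $2\eta(1-q\eta)(q-2)$, the value for which the five weights sum to one (an assumption of this sketch). It then compares the resulting average of $(Z^{ij})^m$ with the closed form inside the logarithm of (E22). Function names are hypothetical, and $m > 0$ is assumed so that the $Z^{ij} = 0$ case contributes nothing.

```python
def link_cases(q, eta):
    """Return (Z value, probability) pairs for the leading link configurations."""
    soft = 1.0 - q * eta  # probability that a field is soft
    return [
        (0.0, q * eta**2),                  # two hard fields of the same color
        (1.0, q * (q - 1) * eta**2),        # two hard fields of different colors
        (1.0, soft**2),                     # two soft fields (leading order only)
        (1.0, 2.0 * eta * soft * (q - 2)),  # hard + soft, hard color outside the soft pair
        (0.5, 4.0 * eta * soft),            # hard + soft, hard color inside the soft pair
    ]


def link_average(q, eta, m):
    """Average of Z**m over the tabulated cases (requires m > 0)."""
    return sum(p * z**m for z, p in link_cases(q, eta))


def link_average_closed_form(q, eta, m):
    """Argument of the logarithm in (E22)."""
    a = 4.0 * eta * (1.0 - q * eta)
    return 1.0 - q * eta**2 - a + a / 2.0**m


if __name__ == "__main__":
    q, m = 50, 1.0
    eta = 1.0 / q - 2.0**m / (2.0 * q**2)
    print(sum(p for _, p in link_cases(q, eta)))
    print(link_average(q, eta, m), link_average_closed_form(q, eta, m))
```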



Putting together the two pieces (E18) and (E22), expanding $\eta$ according to (E14) and considering only the highest 
order in $c$, we can finally write the free energy as

$$\Phi_s(m) = \log q - \frac{c}{2q} + \frac{2^m-2}{2q} - \frac{c}{4q^2}. \tag{E23}$$

The internal entropy $s(m)$ and the complexity $\Sigma = \Phi_s(m) - m\,s(m)$ are then

$$s(m) = \frac{\partial\Phi_s(m)}{\partial m} = \frac{2^m\log 2}{2q}, \tag{E24}$$

$$\Sigma(m) = \log q - \frac{c}{2q} + \frac{2^m - 2 - m\,2^m\log 2}{2q} - \frac{c}{4q^2}, \tag{E25}$$



and the complexity is thus zero for $c_{\Sigma=0} = 2q\log q - \log q - 2 + 2^m[1 - m\log 2] + o(1)$. In particular, one has for the 
coloring and the condensation thresholds

$$c_{\Sigma=0}(m=0) = 2q\log q - \log q - 1 + o(1), \tag{E26}$$

$$c_{\Sigma=0}(m=1) = 2q\log q - \log q - 2\log 2 + o(1). \tag{E27}$$

For connectivity $c = 2q\log q - \log q + \alpha$, one gets

$$2q\,s(m) \simeq 2^m\log 2, \tag{E28}$$

$$2q\,\Sigma(m) \simeq 2^m - 2 - m\,2^m\log 2 - \alpha. \tag{E29}$$
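
For concreteness, the asymptotic thresholds can be evaluated numerically. The sketch below codes the large-$q$ complexity $\Sigma(m) = \log q - \frac{c}{2q} + \frac{2^m - 2 - m 2^m\log 2}{2q} - \frac{c}{4q^2}$ and the zero-complexity connectivity $2q\log q - \log q - 2 + 2^m[1 - m\log 2]$ with the $o(1)$ correction dropped. The function names are illustrative assumptions, and the numbers are meaningful only for large $q$.

```python
import math


def complexity(m, q, c):
    """Large-q complexity Sigma(m) at connectivity c, following (E25)."""
    return (math.log(q) - c / (2.0 * q)
            + (2.0**m - 2.0 - m * 2.0**m * math.log(2.0)) / (2.0 * q)
            - c / (4.0 * q**2))


def zero_complexity_connectivity(m, q):
    """Connectivity at which Sigma(m) vanishes, with the o(1) term dropped."""
    return (2.0 * q * math.log(q) - math.log(q) - 2.0
            + 2.0**m * (1.0 - m * math.log(2.0)))


def coloring_threshold(q):
    """m = 0, cf. (E26)."""
    return zero_complexity_connectivity(0.0, q)


def condensation_threshold(q):
    """m = 1, cf. (E27)."""
    return zero_complexity_connectivity(1.0, q)


if __name__ == "__main__":
    for q in (10, 100, 1000):
        c_d = condensation_threshold(q)
        c_s = coloring_threshold(q)
        # The residual 2 q Sigma at the threshold should shrink as q grows.
        print(q, c_d, c_s, 2 * q * complexity(1.0, q, c_d))
```

In this approximation the gap between the coloring and the condensation thresholds is $2\log 2 - 1$, independent of $q$.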